DAPLAB – Data Analysis and Processing Lab http://daplab.ch
Reduces the entry barrier for companies to find value in their data and ultimately become data-driven companies. Tue, 10 Oct 2017 20:24:41 +0000 en-US

Workshop Data Science 19-20 October 2017 http://daplab.ch/2017/10/10/workshop-data-science-19-20-october-2017/ Tue, 10 Oct 2017 20:24:41 +0000 http://daplab.ch/?p=310

The post Workshop Data Science 19-20 October 2017 appeared first on DAPLAB - Data Analysis and Processing Lab.

Dear DAPLAB’ers, there are still some places available for the workshop on data science co-organized by eXascale Infolab (UNIFR) and iCoSys (HEIA-FR).

When: the afternoons of 19 and 20 October 2017

Where: UNIFR / HEIA-FR

What: This workshop, organized by IT-Valley, is aimed at any IT professional who wants to discover the world of Big Data and Data Science. The program consists of two days punctuated by hands-on exercises on Hadoop technologies and distributed analytics frameworks. Some of the exercises will run on the DAPLAB infrastructure.

Registration:  https://goo.gl/uQD99T

DAPLAB updated to HDP 2.6.1 http://daplab.ch/2017/08/04/293/ Fri, 04 Aug 2017 14:40:32 +0000 http://daplab.ch/?p=293

The post DAPLAB updated to HDP 2.6.1 appeared first on DAPLAB - Data Analysis and Processing Lab.

Dear users and friends,

Thanks to a great team effort, the DAPLAB stack has been updated to the latest version. We are now running HDP 2.6.1!

The following technologies have also been upgraded in the process:

HDP 2.6.1
Ambari 2.5.2
Spark 1.6.2
Spark2 2.1.1
Kafka 0.10.1
Cassandra 3.11
Flink 1.4

We hope you will enjoy the new features that are now available.

Happy coding!

Google #HashCode 2017 — Hub @HEIA-FR http://daplab.ch/2017/01/19/google-hashcode-2017-hub-eia-fr/ Thu, 19 Jan 2017 06:57:54 +0000 http://daplab.ch/?p=255

The post Google #HashCode 2017 — Hub @HEIA-FR appeared first on DAPLAB - Data Analysis and Processing Lab.

We’re delighted to announce that DAPLAB, in collaboration with iCoSys and GDG Fribourg, will again this year host an official Hub for the next Google #HashCode, on Thursday, Feb 23. Since Thursday is our hacky day, let’s code something different (or not, we’ll see :)).

When: Thursday, February 23. Doors open around 5:30pm and close around midnight.
What: Google #HashCode Contest
Where: Room C00.11/15, Haute Ecole d’Ingénierie et d’Architecture de Fribourg, Boulevard de Pérolles 80, Fribourg.
How: Teams organize themselves. Remember to register before Feb 20 and to specify the Fribourg Hub in your registration! Power and network connection are provided.

We’ll open room C00.11/15 at the Haute Ecole d’Ingénierie et d’Architecture de Fribourg (HEIA-FR) at 5:30pm to give everyone time to warm up their laptops and environments. Signs will be posted to guide you safely to the room. The official contest starts at 6:30pm and lasts 4 hours. Drinks will be available for participants.

Please reach out on HipChat or via @DAPLABCH for more details. Teams can be formed on Thursday, but you MUST register before Feb 20. Please mention that you’ll participate through the Fribourg Hub; this will help our logistics!

Hacky Thursdays — Back to school Fall 2016 http://daplab.ch/2016/08/29/hacky-thursdays-back-school-fall-2016/ Mon, 29 Aug 2016 13:22:12 +0000 http://daplab.ch/?p=229

The post Hacky Thursdays — Back to school Fall 2016 appeared first on DAPLAB - Data Analysis and Processing Lab.

Summer is flying by. It’s almost dark outside again when Hacky Thursday time comes around 🙂 After this rather long summer break, it’s time to go back to school. The summer’s announcements were legion, and many, many new projects have been donated to the Apache Software Foundation (ASF). I challenge any of you to name 9 different stream processing frameworks, platforms or engines right now. Yes, 9, because there are at least 9 of them in the ASF at the time of writing.

Hence, the menu proposed till the end of the year will be:

(Apache) Streams Processing WTF?

If you happen to know the answer to this question, you’re more than welcome to contribute to one session or another 🙂

If, like me, you keep piling up articles about these technologies and promised yourself in your New Year’s resolutions to hack around with them, it’s not too late to achieve this goal.

In short, we’ll build a use case to try all of these technologies and compare them against requirements we’ll set in advance, such as ramp-up time, ease of deployment, integration with YARN, debugging, performance, etc.

We’ll obviously use Kafka as the dispatcher, so we’ll dig into Kafka before the heavy lifting starts, so that everyone is on the same page.
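Since the use case needs a stream of events for Kafka to dispatch, here is a minimal Python sketch of what an event generator could look like. It is only an illustration under assumptions: the field names (`id`, `timestamp_ms`, `sensor`, `value`) and the JSON encoding are hypothetical, not the ones defined in the sessions, and actually publishing to a Kafka topic is left out so that the sketch runs with the standard library alone.

```python
import json
import random
import time
from typing import Dict, Iterator, Optional


def generate_events(count: int, seed: Optional[int] = None) -> Iterator[Dict[str, object]]:
    """Yield `count` synthetic events (hypothetical schema, for illustration)."""
    rng = random.Random(seed)  # seeded so that runs can be reproduced
    for sequence in range(count):
        yield {
            "id": sequence,                            # monotonically increasing id
            "timestamp_ms": int(time.time() * 1000),   # event creation time
            "sensor": f"sensor-{rng.randint(1, 5)}",   # one of five fake sources
            "value": round(rng.uniform(0.0, 100.0), 2),
        }


def to_message(event: Dict[str, object]) -> bytes:
    """Serialize one event as UTF-8 JSON, a common payload format for Kafka."""
    return json.dumps(event, sort_keys=True).encode("utf-8")


if __name__ == "__main__":
    # Print a few messages instead of sending them to a broker.
    for event in generate_events(3, seed=42):
        print(to_message(event).decode("utf-8"))
```

Each stream processing engine under comparison could then consume the same messages from one topic, which keeps the comparison fair.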

The schedule is given below. It may be subject to a few changes, but you get the main idea.

  • September 1: Warm up — use case definition and event generator
  • September 8: DAPLAB/iCoSys/eXascale Barbecue
  • September 15: Kafka and Kafka internals
  • September 22: Spark
  • September 29: Flink
  • October 6: Apex
  • October 13: Kafka Streams
  • October 20: Fall break
  • October 27: Storm
  • November 3: Beam
  • November 10: Samza
  • November 17: Gearpump
  • November 24: Ignite
  • December 1: Recap. We’ll pick one (called winner below) and spend December digging deeper into this winner
  • December 8: winner 1/2
  • December 15: winner 2/2
  • December 22: Fondue

If you know one of these technologies and would like to take ownership of a session, let me know; this would be greatly appreciated.

Last but not least, starting in September, companies will have the opportunity to sponsor the aperos. Please reach out to me if you’re interested or know companies that might be.

Thanks, and see you soon!

New infrastructure http://daplab.ch/2016/08/26/new-infrastructure/ Fri, 26 Aug 2016 06:58:17 +0000 http://daplab.ch/?p=215

The post New infrastructure appeared first on DAPLAB - Data Analysis and Processing Lab.

We are glad to announce an update to our infrastructure. We added a whole new batch of machines, boosting the performance of the DAPLAB Hadoop cluster.

 

Resource     Old       Added     Total
CPU cores    372       128       500
RAM          978 GB    1024 GB   2002 GB
Storage      323 TB    192 TB    515 TB
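The totals in the table can be checked with a few lines of Python. This is just a sanity check on the published figures; the relative growth it prints (roughly +34% cores, +105% RAM, +59% storage) is derived from those same numbers and is not an extra measurement.

```python
# Figures copied from the table: (old capacity, added capacity) per resource.
capacity = {
    "CPU cores": (372, 128),
    "RAM (GB)": (978, 1024),
    "Storage (TB)": (323, 192),
}


def summarize(old: int, added: int):
    """Return the new total and the growth relative to the old capacity, in percent."""
    total = old + added
    growth_pct = 100.0 * added / old
    return total, growth_pct


for name, (old, added) in capacity.items():
    total, growth_pct = summarize(old, added)
    print(f"{name}: {old} + {added} = {total} (+{growth_pct:.0f}%)")
```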

 

In total, 8 machines with Intel Xeon E5-2630 v3 2.40 GHz processors (8 cores each) have been added. A special thank you to Christophe Bovigny for installing and provisioning the new machines. Great work!

Voldemort Knowledge Graph http://daplab.ch/2016/08/02/voldemort-knowledge-graph/ Tue, 02 Aug 2016 10:53:57 +0000 http://daplab.ch/?p=201

The post Voldemort Knowledge Graph appeared first on DAPLAB - Data Analysis and Processing Lab.

VoldemortKG is an exciting project which explores the extent to which entities represented in different ways repeat on the Web, how they are related, and how they complement (or link to) each other. The project has released a paper and made its first results available on this website.


DAPLAB has provided the resources needed to create the dataset as well as to perform some of the computations. Many thanks to our collaborators Benoit Perroud and Christophe Bovigny for their work!

We hope this is just the beginning of fruitful collaborations with external researchers and companies, and that our infrastructure will help more researchers in their ground-breaking work.


Taking advantage of the growing amount of structured data produced on the Web is critical for a number of tasks, from identifying tail entities to enriching existing knowledge bases with new properties as they emerge on the Web. While this information has so far been exploited mainly by commercial companies, it remains under-explored ground for the research community, where several fundamental research challenges arise.

VoldemortKG’s goal is to provide tools, knowledge and models to extract and match structured pieces of data with high confidence in addition to provenance data, which constitutes a playground for researchers interested in a number of tasks including entity disambiguation & linking, entity typing, ad-hoc object retrieval or provenance management.

Big Data For SME – Presentation of DAPLAB at the 49es Journées romandes des arts et métiers http://daplab.ch/2016/07/06/presentation-daplab-49es-journees-romandes-des-arts-et-metiers-champery-27-et-28-juin-2016/ Wed, 06 Jul 2016 10:11:54 +0000 http://daplab.ch/?p=188

The post Big Data For SME – Presentation of DAPLAB at the 49es Journées romandes des arts et métiers appeared first on DAPLAB - Data Analysis and Processing Lab.

We presented “Big Data for SME with DAPLAB” at the 49th “Journées Romandes des Arts et Métiers”. About 100 people attended this conference on “Numérisation – Chances et Défis pour les PMEs” (digitalization: opportunities and challenges for SMEs), held in Champery (CH) on 27 and 28 June 2016. More information here.

DAPLAB drew the interest of many attendees, who contacted us after the presentation. It indeed offers SMEs a good opportunity to enter the Big Data and Data Analytics arena.

Excerpt from what was said during the presentation and Q&A session (translated from French):

 … DAPLAB is a lab open to all companies and researchers who are asking themselves about Big Data. In this field, the entry ticket is expensive: you need machines and experts. The big players are getting into it. We lower the entry barrier for SMEs. We support them in exploring their data and extracting value from it. The Canton of Fribourg contributes to the funding …

Big Data For SME with DAPLAB - Jean Hennebert - presenter of DAPLAB at the Journée Romande des Arts et Métiers.

A part of the infra moved! http://daplab.ch/2016/07/04/part-daplab-infra-moved/ Mon, 04 Jul 2016 07:38:00 +0000 http://daplab.ch/?p=179

The post A part of the infra moved! appeared first on DAPLAB - Data Analysis and Processing Lab.

Dear followers, we are happy to announce that part of the DAPLAB infrastructure has moved to another server room in the Engineering buildings. We now occupy two server rooms in two separate buildings, and this has some consequences, hopefully for the best.

  • We are now safer by occupying two server rooms. In case of fire or cooling failure, we are “room redundant”;
  • We can now acquire more capacity: we expect good news in the next few weeks about more storage and more computing power for DAPLAB;
  • The network link between the two legs of DAPLAB is not optimal yet (i.e. no fiber), but we are fighting for it. For now, let’s see how YARN handles this situation and whether our server split was well thought out.
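One standard way to make Hadoop aware of such a split is rack awareness: the `net.topology.script.file.name` property in `core-site.xml` can point to an executable script that receives host names or IP addresses as arguments and prints one rack path per argument. The sketch below shows what such a script might look like in Python. It is an assumption for illustration, not a description of the actual DAPLAB configuration: the host names and the `/room-a` and `/room-b` paths are hypothetical.

```python
#!/usr/bin/env python3
"""Sketch of a Hadoop topology script mapping each node to its server room."""
import sys

# Hypothetical mapping; a real deployment would list its own hosts or IPs.
RACK_BY_HOST = {
    "daplab-wn-01": "/room-a",
    "daplab-wn-02": "/room-a",
    "daplab-wn-11": "/room-b",
    "daplab-wn-12": "/room-b",
}

# Hadoop's conventional name for nodes whose location is unknown.
DEFAULT_RACK = "/default-rack"


def resolve(hosts):
    """Return one rack path per host, in the same order as the input."""
    return [RACK_BY_HOST.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # Hadoop passes the hosts as arguments and reads the rack paths,
    # separated by whitespace, from standard output.
    print(" ".join(resolve(sys.argv[1:])))
```

With the two rooms declared as separate racks, the default HDFS block placement policy spreads the replicas of a block across more than one rack, and the schedulers can prefer rack-local containers, which limits traffic over the slower inter-room link.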

Stay tuned for more news soon.

HDP 2.4.2 on the DAPLAB http://daplab.ch/2016/07/01/hdp-2-4-2-daplab/ Fri, 01 Jul 2016 10:56:08 +0000 http://daplab.ch/?p=173

The post HDP 2.4.2 on the DAPLAB appeared first on DAPLAB - Data Analysis and Processing Lab.

We are proud to announce that DAPLAB has been upgraded to the latest Hortonworks release, HDP 2.4.2. The upgrade went perfectly well, thanks to the skills of our technical team.


The official Apache versions of most HDP 2.4.2 components are unchanged from HDP 2.4.0.0, with the exception of Spark and Kafka. Spark is upgraded from 1.6.0 to 1.6.1; Kafka is upgraded from 0.9.0 to 0.9.0.1.

Spark 1.6.1
  • ODBC/JDBC support for SparkSQL
  • Support for Spark Streaming and Kafka in a Secure Cluster (Kerberos-enabled)
  • Oozie Token support for Spark jobs
Kafka 0.9.0.1
  • Support for MirrorMaker with a secure cluster (Kerberos-enabled)

More info: Hortonworks HDP 2.4.2 release notes

 

DAPLAB is featured on the Swiss TV http://daplab.ch/2016/03/11/daplab-is-featured-on-the-swiss-tv/ Fri, 11 Mar 2016 08:59:21 +0000 http://daplab.ch/?p=109

The post DAPLAB is featured on the Swiss TV appeared first on DAPLAB - Data Analysis and Processing Lab.

On Saturday, 5 March, DAPLAB and its partners were featured on the Swiss television news (in French). We were able to introduce our activities and illustrate them with case studies from our partners Infoteam and Swisscom.
