Lucy Linder – DAPLAB – Data Analysis and Processing Lab
http://daplab.ch
Reduces the entry barrier for companies to find value out of their data and ultimately turn into a data-driven company
Fri, 04 Aug 2017 14:42:54 +0000

DAPLAB updated to HDP 2.6.1
http://daplab.ch/2017/08/04/293/
Fri, 04 Aug 2017 14:40:32 +0000

The post DAPLAB updated to HDP 2.6.1 appeared first on DAPLAB - Data Analysis and Processing Lab.

Dear users and friends,

Thanks to a great team effort, the DAPLAB stack has been updated to the latest version. We are now running HDP 2.6.1!

The following technologies have also been upgraded in the process:

HDP 2.6.1
Ambari 2.5.2
Spark 1.6.2
Spark2 2.1.1
Kafka 0.10.1
Cassandra 3.11
Flink 1.4

We hope you will enjoy the new features that are now available.

Happy coding!

New infrastructure
http://daplab.ch/2016/08/26/new-infrastructure/
Fri, 26 Aug 2016 06:58:17 +0000

The post New infrastructure appeared first on DAPLAB - Data Analysis and Processing Lab.

We are glad to announce an update of our infrastructure. We added a batch of new machines, improving the performance of the DAPLAB Hadoop cluster.

What        Old       New       Total
CPU cores   372       128       500
RAM         978 GB    1024 GB   2002 GB
Storage     323 TB    192 TB    515 TB

In total, 8 machines with Intel Xeon E5-2630 v3 2.40 GHz processors (8 cores each) have been incorporated. A special thank you to Christophe Bovigny for installing and provisioning the new machines. Great work!
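The figures in the capacity table above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the dictionary layout and the names capacity, totals_match and NEW_MACHINES are ours, and the numbers are copied from the table.

```python
# Figures copied from the capacity table; the data layout is illustrative only.
capacity = {
    "CPU cores": {"old": 372, "new": 128, "total": 500},
    "RAM (GB)": {"old": 978, "new": 1024, "total": 2002},
    "Storage (TB)": {"old": 323, "new": 192, "total": 515},
}


def totals_match(table):
    """Return True when old + new equals the stated total for every row."""
    return all(row["old"] + row["new"] == row["total"] for row in table.values())


# The announcement mentions 8 new machines.
NEW_MACHINES = 8
cores_per_machine = capacity["CPU cores"]["new"] // NEW_MACHINES

print(totals_match(capacity))  # True
print(cores_per_machine)       # 16
```

The 16 cores per new machine would be consistent with two 8-core processors in each machine, but that is our reading of the numbers and not something the announcement states.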

Voldemort Knowledge Graph
http://daplab.ch/2016/08/02/voldemort-knowledge-graph/
Tue, 02 Aug 2016 10:53:57 +0000

The post Voldemort Knowledge Graph appeared first on DAPLAB - Data Analysis and Processing Lab.

VoldemortKG is an exciting project that explores the extent to which entities represented in different ways repeat on the Web, how they are related, and how they complement (or link to) each other. The project has published a paper and made its first results available on this website.


DAPLAB has provided the resources needed to create the dataset and to perform some of the computations. Many thanks to our collaborators Benoit Perroud and Christophe Bovigny for their work!

We hope this is just the beginning of fruitful collaborations with external researchers and companies, and that our infrastructure will help more researchers in their ground-breaking work.


Taking advantage of the growing amount of structured data produced on the Web is critical for a number of tasks, from identifying tail entities to enriching existing knowledge bases with new properties as they emerge on the Web. While this information has mainly been exploited by commercial companies, it remains under-explored ground for the research community, where several fundamental research challenges arise.

VoldemortKG’s goal is to provide tools, knowledge and models to extract and match structured pieces of data with high confidence in addition to provenance data, which constitutes a playground for researchers interested in a number of tasks including entity disambiguation & linking, entity typing, ad-hoc object retrieval or provenance management.

HDP 2.4.2 on the DAPLAB
http://daplab.ch/2016/07/01/hdp-2-4-2-daplab/
Fri, 01 Jul 2016 10:56:08 +0000

The post HDP 2.4.2 on the DAPLAB appeared first on DAPLAB - Data Analysis and Processing Lab.

We are proud to announce that the DAPLAB has been upgraded to the latest Hortonworks release, HDP 2.4.2. The upgrade went perfectly well, thanks to the skills of our technical team.


The official Apache versions of most HDP 2.4.2 components are unchanged from HDP 2.4.0.0, with the exception of Spark and Kafka. Spark is upgraded from 1.6.0 to 1.6.1; Kafka is upgraded from 0.9.0 to 0.9.0.1.

Spark 1.6.1
  • ODBC/JDBC support for SparkSQL
  • Support for Spark Streaming and Kafka in a Secure Cluster (Kerberos-enabled)
  • Oozie Token support for Spark jobs
Kafka 0.9.0.1
  • Support for MirrorMaker with a secure cluster (Kerberos-enabled)

More info: Hortonworks HDP 2.4.2 release notes

 
