Spark MLlib Hello World

This page provides a “copy-paste” tutorial for running your first Spark MLlib script. Requirements: SSH (for Windows, use PuTTY and see how to create a key with PuTTY); an account in the DAPLAB, with your SSH public key sent to Benoit; a browser: well, if you can access this page, you should […]

Spark Hello World

A new tutorial is available on docs.daplab.ch. It will guide you through the basics of Apache Spark and its Scala interpreter (spark-shell). Enjoy!

A new framework to simplify interaction with YARN: Apache Twill

YARN, aka NextGen MapReduce, is awesome for building fault-tolerant distributed applications. But writing a plain YARN application is far from trivial and might even be a show-stopper for many engineers. The good news is that a framework to simplify interaction with YARN has emerged and entered the Apache Software Foundation: Apache Twill. While still in the incubation phase, the project looks […]

HDFS Hello World

This page provides a “copy-paste” tutorial to familiarize yourself with HDFS commands. It mainly focuses on user commands (uploading data to and downloading data from HDFS). Requirements: SSH (for Windows, use PuTTY and see how to create a key with PuTTY); an account in the DAPLAB, with your SSH public key sent to Benoit; a browser: well, […]

Available dataset: homogeneous meteorological data

We provide access to homogeneous monthly values of temperature and precipitation for 14 stations from 1864 until today. Yearly values are averaged over the whole of Switzerland since 1864 and are now on the DAPLAB! Data set explanation: the file is a .txt file with a four-row header. MeteoSchweiz / MeteoSuisse / MeteoSvizzera / MeteoSwiss […]