{"id":47,"date":"2015-09-04T09:28:06","date_gmt":"2015-09-04T09:28:06","guid":{"rendered":"http:\/\/daplab.ch\/?p=47"},"modified":"2015-10-07T12:27:23","modified_gmt":"2015-10-07T12:27:23","slug":"hdfs-hello-world","status":"publish","type":"post","link":"https:\/\/daplab.ch\/2015\/09\/04\/hdfs-hello-world\/","title":{"rendered":"HDFS Hello World"},"content":{"rendered":"
This page is a “copy-paste”-style tutorial to help you get familiar with HDFS commands<\/a>. It mainly focuses on user commands (uploading and downloading data into HDFS).<\/p>\n While the source of truth for HDFS commands is the source code, the documentation page describing the\u00a0 This page is a “copy-paste”-style tutorial to help you get familiar with HDFS commands. It mainly focuses on user commands (uploading and downloading data into HDFS). Requirements SSH (for Windows, use PuTTY\u00a0and see\u00a0how to create a key with PuTTY) An account in the\u00a0DAPLAB, and send your ssh public key to Benoit. A browser — well, […]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[13],"tags":[7,6],"yoast_head":"\nRequirements<\/h1>\n
\n
Resources<\/h1>\n
hdfs dfs<\/code>\u00a0commands is really useful:<\/p>\n<\/div>\n
\n
Basic Manipulations<\/h1>\n
Listing a folder<\/h3>\n
Your home folder<\/h4>\n
$ hdfs dfs -ls\r\nFound 28 items\r\n...\r\n-rw-r--r-- 3 bperroud daplab_user 6398990 2015-03-13 11:01 data.csv\r\n...\r\n^^^^^^^^^^ ^ ^^^^^^^^ ^^^^^^^^^^^ ^^^^^^^ ^^^^^^^^^^ ^^^^^ ^^^^^^^^\r\n    1      2    3          4         5        6        7      8\r\n<\/pre>\n
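<p>The numbered columns are: (1) permissions, (2) replication factor, (3) owner, (4) group, (5) size in bytes, (6) last modification date, (7) last modification time, (8) file name.<\/p>\n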
Listing the \/tmp folder<\/h4>\n
$ hdfs dfs -ls \/tmp<\/pre>\n<\/div>\n
Uploading a file<\/h3>\n
In \/tmp<\/h4>\n
$ hdfs dfs -copyFromLocal localfile.txt \/tmp\/<\/pre>\n<\/div>\n
The first arguments after <code>-copyFromLocal<\/code>\u00a0point to local files or folders, while the last argument is a file (if only one source file is listed) or a directory in HDFS.<\/div>\n
hdfs dfs -put<\/code>\u00a0does roughly the same thing, but\u00a0
-copyFromLocal<\/code>\u00a0is more explicit when you’re uploading a local file and is thus preferred.<\/div>\n<\/div>\n
Downloading a file<\/h3>\n
From \/tmp<\/h4>\n
$ hdfs dfs -copyToLocal \/tmp\/remotefile.txt .<\/pre>\n<\/div>\n
The first arguments after <code>-copyToLocal<\/code>\u00a0point to files or folders in HDFS, while the last argument is a local file (if only one source file is listed) or a local directory.<\/div>\n
hdfs dfs -get<\/code>\u00a0does roughly the same thing, but\u00a0
-copyToLocal<\/code>\u00a0is more explicit when you’re downloading a file and is thus preferred.<\/div>\n
Creating a folder<\/h3>\n
In your home folder<\/h4>\n<\/div>\n
$ hdfs dfs -mkdir dummy-folder<\/pre>\n
In \/tmp<\/h4>\n<\/div>\n
$ hdfs dfs -mkdir \/tmp\/dummy-folder<\/pre>\n
\/user\/bperroud<\/code>\u00a0for instance.<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"