Uploading a file to HDFS
Hi,

I have a couple of questions about the process of uploading a large file (>
10GB) to HDFS.

To make sure my understanding is correct, assume I have a cluster of N
machines. What happens in each of the following cases?

Case 1:
Assume I want to upload a file (input.txt) of size K GB that resides on
the local disk of machine 1, which is the namenode only (it does not play
the datanode role). If I run the command -put input.txt {some hdfs dir}
from the namenode, will the namenode read the first 64MB into a temporary
pipe and then transfer it to one of the cluster's datanodes once finished?
Or does the namenode not read the file at all, and instead ask a certain
datanode to read the 64MB window from the file remotely?

Case 2:
Assume machine 1 is the namenode, but I run the -put command from
machine 3 (which is a datanode). Which machine will start reading the file?
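
For a sense of scale, here is a minimal sketch of how many 64MB windows a
file of K GB would be split into. It assumes 1 GB = 1024 MB and that 64MB
is the block size in effect on the cluster; the function name block_count
is only illustrative and is not part of any Hadoop API.

```python
import math

# Assumption: the 64MB window mentioned above is the cluster's block size.
BLOCK_SIZE_MB = 64


def block_count(file_size_gb, block_size_mb=BLOCK_SIZE_MB):
    """Return how many blocks a file of file_size_gb GB occupies.

    Assumes 1 GB = 1024 MB; a partially filled last block still counts.
    """
    file_size_mb = file_size_gb * 1024
    return math.ceil(file_size_mb / block_size_mb)


# A 10 GB file under these assumptions
print(block_count(10))  # prints 160
```

So whichever machine ends up reading input.txt, a 10GB upload would involve
on the order of 160 such windows under these assumptions.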

--
Best Regards,
Karim Ahmed Awara
