MapReduce >> mail # user >> Uploading file to HDFS


RE: Uploading file to HDFS
I think the problem here is that he doesn't have Hadoop installed at this
other location, so there's no Hadoop DFS client there to do the put directly
into HDFS. He would normally copy the file to one of the nodes in the cluster
where the client files are installed. I've had the same problem recently.

I've tried setting up dfs-hdfs-proxy, though I must say that it's been
crashing when I try to put modest to large files through it (but I've got a
thread going with the developer on that issue). That, or one of the other
remote mount options, might work well.

http://wiki.apache.org/hadoop/MountableHDFS
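
If one of the mount options does hold up, the upload itself becomes an ordinary file copy onto the mounted path. Here is a minimal sketch of that, assuming the mount behaves like a normal directory; the function name and the example paths are hypothetical, and I haven't run this against a real mount. HDFS is designed around sequential writes, so the sketch sticks to a plain front-to-back copy with a fixed-size buffer and never holds the whole file in memory.

```python
import shutil


def copy_to_mount(local_path, mounted_path, buf_bytes=8 * 1024 * 1024):
    """Copy local_path to mounted_path sequentially, buf_bytes at a time.

    mounted_path is assumed to sit under a directory where HDFS has been
    mounted (hypothetical example: /mnt/hdfs/user/dave/big.dat).
    """
    with open(local_path, "rb") as src, open(mounted_path, "wb") as dst:
        # copyfileobj reads and writes in buf_bytes chunks, front to back.
        shutil.copyfileobj(src, dst, buf_bytes)


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    copy_to_mount("/data/big.dat", "/mnt/hdfs/user/dave/big.dat")
```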

You could also install Hadoop on the box that has the 2TB file (I realize
that you might not control it or want to do that depending on the
configuration).

Or a remote NFS mount that you can access from one of the Hadoop boxes?

Split up the file into smaller pieces?
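
A rough sketch of the splitting idea, in case it helps. The function name, the piece naming scheme, and the sizes are all my own assumptions. It reads the source once and writes numbered pieces of at most piece_bytes each. Keep in mind the pieces need as much local space as the original unless each one is uploaded and deleted as you go, and they would land in HDFS as separate files.

```python
import os


def split_file(path, piece_bytes, out_dir, buf_bytes=1024 * 1024):
    """Split path into numbered pieces of at most piece_bytes each.

    Returns the list of piece paths in order. Piece names are
    <original name>.partNNNNN (a made-up convention for this sketch).
    """
    pieces = []
    base = os.path.basename(path)
    with open(path, "rb") as src:
        index = 0
        while True:
            piece_path = os.path.join(out_dir, "%s.part%05d" % (base, index))
            written = 0
            with open(piece_path, "wb") as dst:
                while written < piece_bytes:
                    # Never read past the end of the current piece.
                    chunk = src.read(min(buf_bytes, piece_bytes - written))
                    if not chunk:
                        break
                    dst.write(chunk)
                    written += len(chunk)
            if written == 0:
                # Source is exhausted; drop the empty piece just created.
                os.remove(piece_path)
                break
            pieces.append(piece_path)
            index += 1
    return pieces
```

Concatenating the pieces in name order gives back the original bytes, so nothing is lost as long as they are kept together.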

Those are some ideas. I'd love to hear your final solution, as I've also been
having fits getting data into HDFS from outside the Hadoop environment. I wish
it natively supported NFS mounts or some lightweight, easy-to-install remote
DFS tools.

Dave

-----Original Message-----
From: Harsh J [mailto:[EMAIL PROTECTED]]
Sent: Friday, April 19, 2013 1:40 PM
To: <[EMAIL PROTECTED]>
Subject: Re: Uploading file to HDFS

Can you not simply do a fs -put from the location where the 2 TB file
currently resides? HDFS should be able to consume it just fine, as the
client chunks it into fixed-size blocks.
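
To put rough numbers on the chunking, here is a small back-of-the-envelope sketch. The block sizes (64 MB and 128 MB), the replication factor of 3, and the reading of "each disk is 1TB" as one 1 TB disk per node are assumptions, not values taken from this cluster's configuration.

```python
MB = 1024 ** 2
TB = 1024 ** 4


def block_count(file_bytes, block_bytes):
    """Number of fixed-size blocks needed to hold file_bytes."""
    return -(-file_bytes // block_bytes)  # ceiling division


def raw_bytes(file_bytes, replication):
    """Raw cluster storage consumed once every block is replicated."""
    return file_bytes * replication


if __name__ == "__main__":
    file_size = 2 * TB
    print(block_count(file_size, 64 * MB))    # 32768 blocks at 64 MB
    print(block_count(file_size, 128 * MB))   # 16384 blocks at 128 MB
    # Assumed: replication factor 3, 32 nodes with 1 TB each.
    print(raw_bytes(file_size, 3) // TB, "TB raw of", 32, "TB total")
```

Under those assumptions no single disk has to hold the whole file, since the blocks are spread across the datanodes.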

On Fri, Apr 19, 2013 at 10:05 AM, 超级塞亚人 <[EMAIL PROTECTED]> wrote:
> I have a problem. Our cluster has 32 nodes. Each disk is 1TB. I wanna
> upload a 2TB file to HDFS. How can I put the file to the namenode and upload
> it to HDFS?

--
Harsh J