Hadoop, mail # user - copy data (distcp) from local cluster to the EC2 cluster


Re: copy data (distcp) from local cluster to the EC2 cluster
Aaron Kimball 2009-09-11, 00:31
That's 99% correct. If you want or need to run different versions of HDFS on
the two clusters, then you can't use the hdfs:// protocol to access
both of them in the same command. In that case, use *hftp*://bla/ for the
source fs and hdfs://bla2/ for the dest fs. HFTP is read-only, so it can only
be the source, and distcp has to run on the destination cluster.
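
For the mixed-version case, here is a minimal sketch of how the command might
be assembled. HFTP is a read-only filesystem, so it goes on the source side,
and the job needs to run on the destination (EC2) cluster. The host names and
paths are the placeholders from Anthony's example, and port 50070 is the
default dfs.http.address, which is an assumption about your setup. The script
only prints the command so you can review it first.

```shell
#!/bin/sh
# Sketch only: host names, paths, and the port are placeholders/assumptions.
SRC_NN="local-namenode"   # source NameNode host (placeholder)
SRC_HTTP_PORT=50070       # default dfs.http.address port; adjust if changed
DST_NN="ec2-namenode"     # destination NameNode host (placeholder)
SRC_PATH="/path"
DST_PATH="/path"

# HFTP is read-only, so it may only appear as the source URI.
SRC_URI="hftp://${SRC_NN}:${SRC_HTTP_PORT}${SRC_PATH}"
DST_URI="hdfs://${DST_NN}${DST_PATH}"

DISTCP_CMD="hadoop distcp ${SRC_URI} ${DST_URI}"

# Print the command for review instead of running it.
echo "${DISTCP_CMD}"
```

Run the printed command from a node of the EC2 cluster. The EC2 nodes must be
able to reach the local cluster's HTTP ports for this to work.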

- Aaron

On Tue, Sep 8, 2009 at 12:45 AM, Anthony Urso <[EMAIL PROTECTED]> wrote:

> Yes, just run something along the lines of:
>
> hadoop distcp hdfs://local-namenode/path hdfs://ec2-namenode/path
>
> on the job tracker of a MapReduce cluster.
>
> Make sure that your EC2 security group setup allows HDFS access from
> the local HDFS cluster and from wherever you run the MapReduce job.
> Also, I believe both HDFS setups still need to be running the same
> version of Hadoop.
>
> More here:
>
> http://hadoop.apache.org/common/docs/r0.20.0/distcp.html
>
> Cheers,
> Anthony
>
> On Mon, Sep 7, 2009 at 10:37 PM, stchu <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > Does distcp support copying data from my local cluster (1 master + 3
> > slaves, fs=hdfs) to the EC2 cluster (1 master + 2 slaves, fs=hdfs)?
> > If it is supported, how do I do it? I would appreciate any guidance or
> > suggestions.
> >
> > stchu
> >
>