Hadoop >> mail # user >> copy data (distcp) from local cluster to the EC2 cluster


Re: copy data (distcp) from local cluster to the EC2 cluster
That's 99% correct. If you want or need to run different versions of HDFS
on the two clusters, then you can't use the hdfs:// protocol to access both
of them in the same command. In this case, use *hftp*://bla/ for the source
fs and hdfs://bla2/ for the dest fs. HFTP is read-only, so it can only be
the source, and the distcp job has to run on the destination cluster.
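
For the mixed-version case, the command could be assembled roughly as in
the sketch below. The hostnames and paths are placeholders borrowed from
the example quoted further down, and port 50070 is an assumption (the usual
default for dfs.http.address on the source NameNode); check your own
configuration. The script only prints the command so you can review it
before running it on the destination cluster.

```shell
#!/bin/sh
# Sketch only: builds a distcp command for two clusters on different
# HDFS versions. All hosts, ports, and paths are placeholders.

SRC_NN="local-namenode"   # source NameNode host (hypothetical)
SRC_HTTP_PORT="50070"     # assumed dfs.http.address port on the source
DST_NN="ec2-namenode"     # destination NameNode host (hypothetical)

# Print the command instead of executing it.
build_distcp_cmd() {
    src_path="$1"
    dst_path="$2"
    # hftp is read-only, so it appears only on the source side;
    # the destination is written through hdfs://.
    echo "hadoop distcp hftp://${SRC_NN}:${SRC_HTTP_PORT}${src_path} hdfs://${DST_NN}${dst_path}"
}

build_distcp_cmd /path /path
```

Once the printed command looks right, run it from a node of the
destination (EC2) cluster, since that is where the tasks need write access.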

- Aaron

On Tue, Sep 8, 2009 at 12:45 AM, Anthony Urso <[EMAIL PROTECTED]> wrote:

> Yes, just run something along the lines of:
>
> hadoop distcp hdfs://local-namenode/path hdfs://ec2-namenode/path
>
> on the job tracker of a MapReduce cluster.
>
> Make sure that your EC2 security group setup allows HDFS access from
> the local HDFS cluster and wherever you run the MapReduce job from.  Also,
> I believe both HDFS setups still need to be running on the same
> version of Hadoop.
>
> More here:
>
> http://hadoop.apache.org/common/docs/r0.20.0/distcp.html
>
> Cheers,
> Anthony
>
> On Mon, Sep 7, 2009 at 10:37 PM, stchu<[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > Does distcp support copying data from my local cluster (1 master +
> > 3 slaves, fs=hdfs) to the EC2 cluster (1 master + 2 slaves, fs=hdfs)?
> > If it's supported, how can I do it? I'd appreciate any guidance or
> > suggestions.
> >
> > stchu
> >
>