Re: copy data (distcp) from local cluster to the EC2 cluster
Aaron Kimball 2009-09-11, 00:31
That's 99% correct. If you want/need to run different versions of HDFS on
the two different clusters, then you can't use the hdfs:// protocol to access
both of them in the same command. In this case, use *hftp*://bla/ for the
source fs and hdfs://bla2/ for the dest fs. HFTP is read-only, so it can only
be the source, and distcp has to run on the destination cluster.
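For reference, here is a rough sketch of the two invocations side by side. It only prints the commands rather than running them. The hostnames and /path are the placeholders used in the quoted message below. Port 50070 is an assumption: it is the usual default for dfs.http.address on the source NameNode, so check your own config. The helper function names are made up for illustration. The cross-version form puts the read-only HFTP URL on the source side and is meant to be run from the destination cluster.

```shell
#!/bin/sh
# Sketch only: prints distcp command lines, does not execute them.
# Hostnames and paths are placeholders taken from the thread.

SRC_NN="local-namenode"   # source (local) NameNode host
DST_NN="ec2-namenode"     # destination (EC2) NameNode host
HTTP_PORT=50070           # ASSUMPTION: default dfs.http.address port on the source

# Both clusters on the same Hadoop version: hdfs:// on both sides.
same_version_cmd() {
  printf 'hadoop distcp hdfs://%s%s hdfs://%s%s\n' "$1" "$3" "$2" "$3"
}

# Different Hadoop versions: read the source over HFTP (read-only),
# write to the destination over hdfs://. Run this on the destination cluster.
cross_version_cmd() {
  printf 'hadoop distcp hftp://%s:%s%s hdfs://%s%s\n' "$1" "$HTTP_PORT" "$3" "$2" "$3"
}

same_version_cmd  "$SRC_NN" "$DST_NN" /path
cross_version_cmd "$SRC_NN" "$DST_NN" /path
```

Once the printed line looks right for your setup, paste it into a shell on a node that can reach both NameNodes.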
On Tue, Sep 8, 2009 at 12:45 AM, Anthony Urso <[EMAIL PROTECTED]> wrote:
> Yes, just run something along the lines of:
> hadoop distcp hdfs://local-namenode/path hdfs://ec2-namenode/path
> on the job tracker of a MapReduce cluster.
> Make sure that your EC2 security group setup allows HDFS access from
> the local HDFS cluster and wherever you run the MapReduce job from. Also,
> I believe both HDFS setups still need to be running on the same
> version of Hadoop.
> More here:
> On Mon, Sep 7, 2009 at 10:37 PM, stchu<[EMAIL PROTECTED]> wrote:
> > Hi,
> > Does distcp support copying data from my local cluster (1 master+3
> > slaves, fs=hdfs) to the EC2 cluster (1 master+2 slaves, fs=hdfs)?
> > If it's supported, how can I do it? I would appreciate any guidance.
> > stchu