Hadoop >> mail # user >> HDFS Rsync process??


Re: HDFS Rsync process??
On 11/30/2010 03:51 AM, Steve Loughran wrote:
> On 30/11/10 03:59, hadoopman wrote:
>
>
> you don't need all the files in the cluster in sync as a lot of them
> are intermediate and transient files.
>
> Instead use dfscopy to copy source files to the two clusters, this
> runs across the machines in the cluster and is also designed to work
> across hadoop versions, with some limitations.
>

Page 70 of the O'Reilly Hadoop book describes using distcp to copy data
between two HDFS clusters. I'm curious whether something like that would
also work here. Would I just reference namenode1 on each cluster when
initiating the copy? Still playing with it. Figured I should ask :-)
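
For reference, a minimal sketch of what such an inter-cluster copy might look like, written as a small Python helper that only assembles the `hadoop distcp` command line. The general form `hadoop distcp hdfs://<source-namenode>:<port>/<path> hdfs://<dest-namenode>:<port>/<path>` is the documented usage; the hostnames, the paths, the port 8020, and the helper name `build_distcp_command` are all assumptions made up for illustration, so substitute whatever your clusters actually use.

```python
def build_distcp_command(src_namenode, dst_namenode, src_path, dst_path, port=8020):
    """Assemble the argument list for an inter-cluster `hadoop distcp` run.

    Port 8020 is an assumed namenode RPC port; check fs.default.name
    on each cluster for the real value.
    """
    src_uri = f"hdfs://{src_namenode}:{port}{src_path}"
    dst_uri = f"hdfs://{dst_namenode}:{port}{dst_path}"
    return ["hadoop", "distcp", src_uri, dst_uri]


if __name__ == "__main__":
    # Hypothetical hostnames: each cluster has its own namenode,
    # so each one needs a fully qualified, distinct address.
    cmd = build_distcp_command(
        "namenode1.cluster-a.example.com",
        "namenode1.cluster-b.example.com",
        "/data/source",
        "/data/source",
    )
    # Print the command instead of running it; pass the list to
    # subprocess.run(cmd, check=True) on a machine with the hadoop CLI.
    print(" ".join(cmd))
```

If both namenodes really share the short name namenode1, each one would need to be addressed by a name or IP that resolves to the right cluster. Copies between clusters running different Hadoop versions have extra limitations, as noted in the quoted reply.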

Thanks