Hadoop >> mail # user >> Re: Mapreduce outputs to a different cluster?


Re: Mapreduce outputs to a different cluster?
As far as I know, you can use distcp to transfer the results of the job
from one cluster to another once the job is done. You can write a simple
script to do that. Simple and tested. Some pointers below:
http://doc.mapr.com/display/MapR/hadoop+distcp
https://www.inkling.com/read/hadoop-definitive-guide-tom-white-3rd/chapter-3/parallel-copying-with-distcp
http://hadoop.apache.org/docs/r1.2.1/distcp.html
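
A minimal sketch of such a script, in Python, assuming the `hadoop` command is on the PATH of the machine that runs it and that the two clusters can reach each other. The NameNode hosts, port, and paths are hypothetical placeholders, and the helper names are made up for illustration:

```python
import subprocess
from typing import List


def build_distcp_command(src_uri: str, dst_uri: str, update: bool = False) -> List[str]:
    """Build the argument list for `hadoop distcp [-update] <src> <dst>`."""
    cmd = ["hadoop", "distcp"]
    if update:
        # -update copies only files that are missing or differ at the destination.
        cmd.append("-update")
    cmd.extend([src_uri, dst_uri])
    return cmd


def copy_job_output(src_uri: str, dst_uri: str, update: bool = False) -> int:
    """Run distcp and return its exit code; raises CalledProcessError on failure."""
    completed = subprocess.run(build_distcp_command(src_uri, dst_uri, update), check=True)
    return completed.returncode


if __name__ == "__main__":
    # Hypothetical addresses: replace with the real NameNodes of cluster A and B.
    src = "hdfs://namenode-a:8020/user/someuser/job-output"
    dst = "hdfs://namenode-b:8020/user/someuser/job-output"
    # Print the command instead of running it, so it can be checked first.
    print(" ".join(build_distcp_command(src, dst)))
```

Call `copy_job_output(src, dst)` right after the MapReduce job finishes successfully, so the copy only starts once the output directory is complete.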

You might be able to do this through the job itself as well, by changing the
output paths of the generated files to point at the other cluster, but I
wouldn't suggest that: there can be latency and performance issues.
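
For illustration only, pointing the job at the other cluster usually means giving it a fully qualified HDFS URI as the output path instead of a plain path. A small sketch in Python that builds such a URI; the host, port, and path are hypothetical, and the helper name is made up:

```python
def qualify_hdfs_path(namenode_host: str, namenode_port: int, path: str) -> str:
    """Return a fully qualified HDFS URI for an absolute path on a given cluster."""
    if not path.startswith("/"):
        raise ValueError("path must be absolute, e.g. /user/someuser/out")
    return f"hdfs://{namenode_host}:{namenode_port}{path}"


if __name__ == "__main__":
    # Hypothetical NameNode of cluster B.
    print(qualify_hdfs_path("namenode-b", 8020, "/user/someuser/job-output"))
```

The resulting string would be passed as the output path argument of the job (for example, to FileOutputFormat.setOutputPath in a Java driver). With a map-only job every mapper then writes across the network to the other cluster, which is where the latency concern comes from.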

Maybe others have a better idea.

Regards,
Shahab
On Thu, Oct 24, 2013 at 6:28 PM, S. Zhou <[EMAIL PROTECTED]> wrote:

> The scenario is: I run a mapreduce job on cluster A (all source data is in
> cluster A), but I want the output of the job to go to cluster B. Is it possible?
> If yes, please let me know how to do it.
>
> Here are some notes on my mapreduce job:
> 1. The data source is an HBase table.
> 2. It only has a mapper, no reducer.
>
> Thanks
> Senqiang
>
>