Re: Mapreduce outputs to a different cluster?
As far as I know, you can use distcp to transfer the results of the job
from one cluster to another once the job is done. You can write a simple
script to do that. It is simple and well tested. Some pointers below:
http://doc.mapr.com/display/MapR/hadoop+distcp
https://www.inkling.com/read/hadoop-definitive-guide-tom-white-3rd/chapter-3/parallel-copying-with-distcp
http://hadoop.apache.org/docs/r1.2.1/distcp.html
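
A minimal sketch of such a script, in Python. The NameNode host names, the
port 8020, and the paths are all hypothetical placeholders, and it assumes
the hadoop command is on the PATH of a machine that can reach both clusters:

```python
import subprocess

# Hypothetical addresses and paths; replace them with your own clusters' values.
SRC = "hdfs://namenode-a:8020/user/example/job-output"
DST = "hdfs://namenode-b:8020/user/example/job-output"


def build_distcp_command(src, dst):
    """Return the argument list for a `hadoop distcp <src> <dst>` copy."""
    return ["hadoop", "distcp", src, dst]


def copy_job_output(src, dst):
    """Run distcp and raise CalledProcessError if it exits with a non-zero status."""
    subprocess.run(build_distcp_command(src, dst), check=True)


# Call this only after the MapReduce job on cluster A has finished:
# copy_job_output(SRC, DST)
```

Keeping the command construction separate from the call makes it easy to
print the command and check it before running it against real clusters.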

You might be able to do this through the job as well, by changing the
output paths of the generated files, but I wouldn't suggest that, since
there can be latency and performance issues.

Maybe others have a better idea.

Regards,
Shahab
On Thu, Oct 24, 2013 at 6:28 PM, S. Zhou <[EMAIL PROTECTED]> wrote:

> The scenario is: I run a mapreduce job on cluster A (all source data is in
> cluster A), but I want the output of the job to go to cluster B. Is it
> possible? If yes, please let me know how to do it.
>
> Here are some notes about my mapreduce job:
> 1. The data source is an HBase table.
> 2. It has only a mapper, no reducer.
>
> Thanks
> Senqiang
>
>