MapReduce, mail # user - Combine data from different HDFS FS


Re: Combine data from different HDFS FS
Harsh J 2013-04-08, 17:34
You should be able to add fully qualified HDFS paths from N clusters
into the same job via FileInputFormat.addInputPath(…) calls. Caveats
may apply for secure environments, but for non-secure mode this should
work just fine. Did you try this and did it not work?
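To illustrate what "fully qualified" means here, the following is a minimal sketch using only the JDK. It checks that each input names its own cluster (an hdfs scheme plus a NameNode authority) before it is handed to the job. The class name, helper names, hostnames, and port 8020 are hypothetical placeholders, not values from this thread. In an actual driver, each validated URI would presumably be passed to FileInputFormat.addInputPath(job, new Path(uri)).

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: not part of Hadoop, shown only to illustrate the idea.
class QualifiedInputs {

    // True when the path carries an hdfs scheme and a NameNode authority,
    // so it cannot silently resolve against the job's default filesystem.
    static boolean isFullyQualified(String path) {
        URI uri = URI.create(path);
        return "hdfs".equalsIgnoreCase(uri.getScheme()) && uri.getAuthority() != null;
    }

    // Converts the given paths to URIs, rejecting any that are not fully qualified.
    static List<URI> qualifiedInputs(String... paths) {
        List<URI> result = new ArrayList<>();
        for (String p : paths) {
            if (!isFullyQualified(p)) {
                throw new IllegalArgumentException("Not a fully qualified HDFS path: " + p);
            }
            result.add(URI.create(p));
        }
        return result;
    }

    public static void main(String[] args) {
        // Placeholder NameNode addresses for two separate clusters (assumptions).
        List<URI> inputs = qualifiedInputs(
                "hdfs://nn1.example.com:8020/data/logs",
                "hdfs://nn2.example.com:8020/data/logs");
        for (URI u : inputs) {
            // In a real MapReduce driver, this is where each input would be added:
            //   FileInputFormat.addInputPath(job, new Path(u));
            System.out.println(u.getAuthority() + " -> " + u.getPath());
        }
    }
}
```

A path such as /data/logs has no scheme or authority, so it would be resolved against whichever cluster the job treats as its default filesystem. Rejecting such paths up front keeps every input tied explicitly to one cluster.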

On Mon, Apr 8, 2013 at 9:56 PM, Pedro Sá da Costa <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I want to combine data stored in different HDFS filesystems so that a
> single job can process all of it. Is it possible to do this with MR, or is
> there another Apache tool that allows me to do this?
>
> E.g.
>
> Hdfs data in Cluster1 ----v
> Hdfs data in Cluster2 -> this job reads the data from Cluster1, 2
>
>
> Thanks,
> --
> Best regards,

--
Harsh J