MapReduce >> mail # user >> MultipleInputs.addInputPath
MultipleInputs.addInputPath
Hi,

I have two different directories that I want to process differently,
so the job has two mappers:

Data1
Data2

In my driver, I add the following:

    MultipleInputs.addInputPath(job, new Path(args[0]),
        TextInputFormat.class, Data1.class);
    MultipleInputs.addInputPath(job, new Path(args[1]),
        TextInputFormat.class, Data2.class);

But what I now want is to select only specific files from each directory.

With a single input, this is how we would usually do it:
FileInputFormat.addInputPaths(job,"Data1/part-00000,Data1/part-00000");

But how do I specify individual files with MultipleInputs?

Basically, there are two mappers processing two different inputs, but I want
to specify which files in those two directories each mapper reads.
How do I do this in Hadoop?
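
One possible direction, offered only as a sketch: MultipleInputs.addInputPath takes a Path, and a Hadoop input path can name a single file as well as a directory, so it may be enough to call addInputPath once per selected file. The helper below builds the per-file path strings using plain Java only. The class name InputFileSelector, the method selectFiles, and the file name part-00001 are hypothetical and used for illustration. The Hadoop call itself appears only in a comment, since it needs the Hadoop jars and a configured Job.

```java
import java.util.ArrayList;
import java.util.List;

public class InputFileSelector {

    // Builds one path string per chosen file inside a directory.
    static List<String> selectFiles(String dir, String... fileNames) {
        // Drop a trailing slash so "Data1/" and "Data1" give the same result.
        String base = dir.endsWith("/") ? dir.substring(0, dir.length() - 1) : dir;
        List<String> paths = new ArrayList<>();
        for (String name : fileNames) {
            paths.add(base + "/" + name);
        }
        return paths;
    }

    public static void main(String[] args) {
        // Hypothetical selection: two files from the first input directory.
        List<String> data1Files = selectFiles("Data1", "part-00000", "part-00001");
        for (String p : data1Files) {
            // In the real driver, each selected file would be registered
            // with its own mapper, for example:
            //   MultipleInputs.addInputPath(job, new Path(p),
            //       TextInputFormat.class, Data1.class);
            System.out.println(p);
        }
    }
}
```

The same loop would be repeated for the second directory with Data2.class as the mapper, so each mapper only sees the files chosen for it.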