MapReduce >> mail # user >> MultipleInputs.addInputPath


MultipleInputs.addInputPath
Hi,

I have two different directories that I want to process differently,
so the job has two mappers:

Data1
Data2

In my driver, I add the following:
MultipleInputs.addInputPath(job, new Path(args[0]),
    TextInputFormat.class,
    Data1.class);
MultipleInputs.addInputPath(job, new Path(args[1]),
    TextInputFormat.class,
    Data2.class);
But now I want to select only certain files (say, two) from these directories.

Usually, this is how we would do it:
FileInputFormat.addInputPaths(job,"Data1/part-00000,Data1/part-00000");

But how do I specify individual files with MultipleInputs?

Basically, I have two mappers processing two different inputs, but I want to
specify which files in those two directories the mappers read.
How do I do this in Hadoop?
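
One direction that might work, as a sketch rather than a confirmed answer: MultipleInputs.addInputPath takes a Path, so it may be possible to pass a path to a single file instead of a directory and call it once per selected file. The snippet below only builds the per-file path strings; the helper name selectFiles and the file names are hypothetical, and the Hadoop call appears only in a comment so the snippet runs without Hadoop on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

class SelectedInputs {
    // Hypothetical helper (not part of Hadoop): joins a directory with each
    // chosen file name to produce one path string per selected file.
    static List<String> selectFiles(String dir, String... fileNames) {
        String base = dir.endsWith("/") ? dir : dir + "/";
        List<String> paths = new ArrayList<>();
        for (String name : fileNames) {
            paths.add(base + name);
        }
        return paths;
    }

    public static void main(String[] args) {
        // File names are illustrative only.
        for (String p : selectFiles("Data1", "part-00000", "part-00001")) {
            // Assumed usage in the driver, one call per selected file:
            // MultipleInputs.addInputPath(job, new Path(p),
            //     TextInputFormat.class, Data1.class);
            System.out.println(p);
        }
        for (String p : selectFiles("Data2", "part-00000")) {
            // Same idea for the second mapper:
            // MultipleInputs.addInputPath(job, new Path(p),
            //     TextInputFormat.class, Data2.class);
            System.out.println(p);
        }
    }
}
```

Calling addInputPath once per file keeps the mapping between each file and its mapper explicit, which is the part FileInputFormat.addInputPaths alone does not express.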