Avro >> mail # user >> Multiple Input for Avro jobs


Re: Multiple Input for Avro jobs
If you only need to read multiple paths, path globs work.
For example, to read both /logs/2012_01 and /logs/2012_02, use the glob path:
/logs/2012_0{1,2}

And to read the four paths /logs/2011_01, /logs/2011_02, /logs/2012_01, and
/logs/2012_02, use:
/logs/201{1,2}_0{1,2}

'*' is a wildcard as well, e.g. /logs/2011_*/
If you need a mapper instance per directory or a different split assignment,
there would be more work involved.
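
To sanity-check which directories a glob like the ones above would select
before handing it to a job, a rough sketch along these lines may help. It
uses only the JDK's java.nio glob matcher on a Unix-like filesystem, which is
an assumption on my part: its brace and '*' handling is close to Hadoop's but
not guaranteed to be identical. GlobCheck and the directory list are
hypothetical names for illustration, not part of Hadoop or Avro.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of Hadoop or Avro.
class GlobCheck {
    // Returns the candidate paths matched by the glob, in input order.
    // Uses java.nio glob syntax as a stand-in for Hadoop's path globs.
    static List<String> matching(String glob, List<String> candidates) {
        PathMatcher matcher =
            FileSystems.getDefault().getPathMatcher("glob:" + glob);
        List<String> hits = new ArrayList<>();
        for (String candidate : candidates) {
            if (matcher.matches(Paths.get(candidate))) {
                hits.add(candidate);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up directory listing for the example.
        List<String> dirs = Arrays.asList(
            "/logs/2011_01", "/logs/2011_02", "/logs/2011_03",
            "/logs/2012_01", "/logs/2012_02");

        // Brace alternatives select specific months or years.
        System.out.println(matching("/logs/2012_0{1,2}", dirs));
        System.out.println(matching("/logs/201{1,2}_0{1,2}", dirs));
        // '*' matches any run of characters within one path component.
        System.out.println(matching("/logs/2011_*", dirs));
    }
}
```

In the job itself the glob string is passed as an ordinary input path, e.g.
FileInputFormat.setInputPaths(conf, new Path("/logs/201{1,2}_0{1,2}")), and
all matched directories go to the same mapper class.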

On 2/8/12 12:24 PM, "Serge Blazhievsky" <[EMAIL PROTECTED]> wrote:

> Hi all,
>
> I am trying to assign different mapper to different folders.
>
> Is there an equivalent of Multiinputs for avro
>
>
>   MultipleInputs.addInputPath(job, new Path(input),
> AvroInputFormat<GenericRecord>.class, MapImpl.class);
>
>
> Thanks
> Serge
>      