Avro, mail # user - Multiple Input for Avro jobs


Serge Blazhievsky 2012-02-08, 20:24
Re: Multiple Input for Avro jobs
Scott Carey 2012-02-08, 21:26
If you only need to read multiple paths, path globs work.
For example, to read both /logs/2012_01 and /logs/2012_02, use the glob path:
/logs/2012_0{1,2}

And to read the four paths /logs/2011_01, /logs/2011_02, /logs/2012_01, and
/logs/2012_02, use:
/logs/201{1,2}_0{1,2}

'*' is a wildcard as well, e.g. /logs/2011_*/
If you need a mapper instance per directory or a different split assignment,
there would be more work involved.
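
To sanity-check which directories a glob would select before submitting a job, a small local sketch like the one below may help. It uses Java's built-in java.nio glob matcher as a stand-in, not Hadoop's own glob code; the two should agree for the simple brace and '*' patterns above, but that is an assumption. The class name GlobCheck and the sample directory list are made up for illustration.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: approximates Hadoop path globs with java.nio globs.
class GlobCheck {

    // True if 'path' matches 'glob' under java.nio glob rules
    // ('{a,b}' is alternation, '*' matches within one path component).
    static boolean matches(String glob, String path) {
        PathMatcher matcher =
            FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(path));
    }

    // Returns the candidates that the glob selects, in input order.
    static List<String> select(String glob, List<String> candidates) {
        List<String> selected = new ArrayList<>();
        for (String candidate : candidates) {
            if (matches(glob, candidate)) {
                selected.add(candidate);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Sample directories, assumed for illustration only.
        List<String> dirs = Arrays.asList(
            "/logs/2011_01", "/logs/2011_02", "/logs/2011_03",
            "/logs/2012_01", "/logs/2012_02");

        System.out.println(select("/logs/2012_0{1,2}", dirs));
        System.out.println(select("/logs/201{1,2}_0{1,2}", dirs));
        System.out.println(select("/logs/2011_*", dirs));
    }
}
```

In the job driver itself, the same glob string would normally be passed as the input path, for example FileInputFormat.setInputPaths(conf, new Path("/logs/201{1,2}_0{1,2}")) with the org.apache.hadoop.mapred API.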

On 2/8/12 12:24 PM, "Serge Blazhievsky" <[EMAIL PROTECTED]> wrote:

> Hi all,
>
> I am trying to assign a different mapper to each of several folders.
>
> Is there an equivalent of MultipleInputs for Avro?
>
>
>   MultipleInputs.addInputPath(job, new Path(input),
> AvroInputFormat<GenericRecord>.class, MapImpl.class);
>
>
> Thanks
> Serge
>      
Serge Blazhievsky 2012-02-08, 22:45
Scott Carey 2012-02-08, 22:53
Serge Blazhievsky 2012-02-08, 23:01