Re: Avro Map-Reduce and ChainMapper
I have not tried or tested ChainMapper with Avro myself.  It will probably
work if you configure the input and output schemas appropriately.
Take a look at what AvroJob.setInputSchema is doing; if you are familiar
enough with Hadoop's configuration, you may be able to work it out.  Others
likely know more about this than I do.
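
To make the configuration idea concrete, here is a rough, untested sketch of
the mechanism. It does not use Hadoop or Avro at all: a plain Map stands in
for a JobConf, and setInputSchema/setOutputSchema are hypothetical helpers
that only model the "store the schema JSON under a configuration key" part of
what AvroJob appears to do. The property names are the ones I believe AvroJob
uses, and the Record schema is a made-up placeholder, so check both against
your Avro version.

```java
import java.util.HashMap;
import java.util.Map;

public class ChainSchemaSketch {
    // Property names I believe AvroJob uses; verify against your Avro version.
    static final String INPUT_SCHEMA = "avro.input.schema";
    static final String OUTPUT_SCHEMA = "avro.output.schema";

    // Stand-in for a per-mapper JobConf: a private copy of the job-level settings.
    static Map<String, String> childConf(Map<String, String> jobConf) {
        return new HashMap<>(jobConf);
    }

    // Models only the schema-storing part of AvroJob.setInputSchema.
    static void setInputSchema(Map<String, String> conf, String schemaJson) {
        conf.put(INPUT_SCHEMA, schemaJson);
    }

    // Models only the schema-storing part of AvroJob.setOutputSchema.
    static void setOutputSchema(Map<String, String> conf, String schemaJson) {
        conf.put(OUTPUT_SCHEMA, schemaJson);
    }

    public static void main(String[] args) {
        // Hypothetical placeholder schema, not the schema from the question.
        String record = "{\"type\":\"record\",\"name\":\"Record\",\"fields\":[]}";

        Map<String, String> jobConf = new HashMap<>();

        // First mapper: non-Avro input, Avro output.
        Map<String, String> firstMapper = childConf(jobConf);
        setOutputSchema(firstMapper, record);

        // Second mapper: reads what the first mapper wrote.
        Map<String, String> secondMapper = childConf(jobConf);
        setInputSchema(secondMapper, record);

        System.out.println("first mapper conf:  " + firstMapper);
        System.out.println("second mapper conf: " + secondMapper);
        System.out.println("job-level conf:     " + jobConf);
    }
}
```

The point of the sketch is that each chained mapper's conf carries its own
schema settings without touching the job-level conf. Whether ChainMapper and
the Avro input/output classes actually honor per-mapper settings this way is
exactly the part I have not verified.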

Also, you may be interested in how things are done in this variation:
https://github.com/wibidata/odiago-avro
On 2/1/12 8:23 AM, "Andrew Kenworthy" <[EMAIL PROTECTED]> wrote:

> Hallo,
>
> Is it possible to chain Avro MR jobs using the ChainMapper? I'm looking to
> chain two map tasks and a reducer, but haven't been able to find any examples:
>
> Chain summary:
> a) first map task: takes non-Avro input and produces K/V output in the form of
> AvroKey(Record), NullWritable
> b) second map task: taking output of first task as its input [mapper extends
> AvroMapper(Record, Pair(Record, NullWritable))]
> c) reducer: AvroReducer
>
> In particular, how would I specify the input and output schemas - simply
> calling AvroJob.setInputSchema/setOutputSchema on the individual chained job
> conf objects?
>
> Thanks,
>
> Andrew