Avro >> mail # user >> Avro Map-Reduce and ChainMapper


Re: Avro Map-Reduce and ChainMapper
You might also want to take a look at
https://github.com/cloudera/crunch/

Not sure what state it's in, but judging by the file names it might support
Flume.

J

On Wed, Feb 8, 2012 at 10:04 AM, Scott Carey <[EMAIL PROTECTED]> wrote:

> I have not tried or tested ChainMapper with Avro myself.  It will probably
> work if you configure the input schemas or output schemas appropriately.
> Take a look at what AvroJob.setInputSchema is doing; if you are familiar
> enough with Hadoop's configuration you may be able to work it out.  Others
> likely know more than I do on this.
>
> Also, you may be interested in how things are done in this variation:
> https://github.com/wibidata/odiago-avro
>
>
> On 2/1/12 8:23 AM, "Andrew Kenworthy" <[EMAIL PROTECTED]> wrote:
>
> Hello,
>
> Is it possible to chain Avro MR jobs using the ChainMapper? I'm looking
> to chain two map tasks and a reducer, but haven't been able to find any
> examples:
>
> Chain summary:
> a) first map task: takes non-Avro input and produces K/V output in the
> form of AvroKey(Record), NullWritable
> b) second map task: takes the output of the first task as its input [mapper
> extends AvroMapper(Record, Pair(Record, NullWritable))]
> c) reducer: AvroReducer
>
> In particular, how would I specify the input and output schemas - simply
> calling AvroJob.setInputSchema/setOutputSchema on the individual chained
> job conf objects?
>
> Thanks,
>
> Andrew
>
>
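
For anyone following the pointer to AvroJob.setInputSchema above: as far as I can tell, it boils down to storing the schema's JSON text under a key in the job configuration. The sketch below models that idea in plain Java, with java.util.Properties standing in for a per-stage JobConf. It does not call Hadoop or Avro, and nobody in this thread has tested ChainMapper with Avro. The class ChainSchemaSketch and its helpers stageConf and compatible are hypothetical names, and the two key strings are my understanding of what Avro's mapred API uses, so check them against your Avro version.

```java
import java.util.Properties;

// Plain-Java stand-in for a per-stage Hadoop JobConf: string keys, string values.
// The key names are an assumption about Avro's mapred API; verify before relying on them.
final class ChainSchemaSketch {
    static final String INPUT_SCHEMA = "avro.input.schema";
    static final String MAP_OUTPUT_SCHEMA = "avro.map.output.schema";

    // Hypothetical helper: builds the private configuration of one chained stage.
    // Pass null for a side of the stage that does not read or write Avro.
    static Properties stageConf(String inputSchemaJson, String outputSchemaJson) {
        Properties conf = new Properties();
        if (inputSchemaJson != null) {
            conf.setProperty(INPUT_SCHEMA, inputSchemaJson);
        }
        if (outputSchemaJson != null) {
            conf.setProperty(MAP_OUTPUT_SCHEMA, outputSchemaJson);
        }
        return conf;
    }

    // Simplified check: adjacent stages line up when the upstream output schema
    // text equals the downstream input schema text. Real Avro schema resolution
    // is more permissive than exact string equality.
    static boolean compatible(Properties upstream, Properties downstream) {
        String out = upstream.getProperty(MAP_OUTPUT_SCHEMA);
        return out != null && out.equals(downstream.getProperty(INPUT_SCHEMA));
    }

    public static void main(String[] args) {
        // Illustrative record schema; the field list is made up for this sketch.
        String record = "{\"type\":\"record\",\"name\":\"Record\","
                + "\"fields\":[{\"name\":\"id\",\"type\":\"string\"}]}";
        // Illustrative pair schema wrapping the record as key, with a null value.
        String pair = "{\"type\":\"record\",\"name\":\"Pair\",\"fields\":["
                + "{\"name\":\"key\",\"type\":" + record + "},"
                + "{\"name\":\"value\",\"type\":\"null\"}]}";

        Properties first = stageConf(null, record);   // (a) non-Avro in, Record out
        Properties second = stageConf(record, pair);  // (b) Record in, Pair out

        System.out.println("stages line up: " + compatible(first, second));
    }
}
```

If the real chain behaves like this model, each mapper added to the chain would get its own configuration object carrying its own schema pair, and the output schema of stage (a) would have to match the input schema of stage (b). Whether ChainMapper actually passes those per-stage settings through to the Avro classes is the open question in this thread.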