Re: Avro-mapred and new Java MapReduce API (org.apache.hadoop.mapreduce)
Scott Carey 2011-11-13, 20:23
I have heard suggestions that it would be useful to model Avro's
interaction with MapReduce using composition rather than inheritance.
Has anyone tried that, or would it be too clumsy? Fitting cleanly into the
mapreduce/mapred API via composition might require changes on the Hadoop
side, however.
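
To make the question concrete, here is a rough sketch of the two shapes, using stand-in types only. Nothing in it is the real Hadoop or Avro API: FrameworkMapper, Wrapper, InheritingAvroMapper, AvroMapFunction and ComposedAvroMapper are all hypothetical names invented for illustration. The inheritance variant follows the "subclass that unwraps and calls a protected method" idea Friso describes below; the composition variant hands the user's logic to a fixed adapter instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Consumer;

// Stand-in for the framework's mapper base class (NOT the real Hadoop Mapper).
abstract class FrameworkMapper<KI, VI, KO, VO> {
    // Called by the framework once per input record.
    abstract void map(KI key, VI value, BiConsumer<KO, VO> out);
}

// Stand-in for a wrapper that carries one Avro datum through the framework.
final class Wrapper<T> {
    final T datum;
    Wrapper(T datum) { this.datum = datum; }
}

// Inheritance: users subclass this and override doMap(). The base class does
// the unwrapping and re-wrapping, then delegates to the protected method.
abstract class InheritingAvroMapper<IN, OUT>
        extends FrameworkMapper<Wrapper<IN>, Void, Wrapper<OUT>, Void> {
    @Override
    final void map(Wrapper<IN> key, Void ignored, BiConsumer<Wrapper<OUT>, Void> out) {
        doMap(key.datum, datum -> out.accept(new Wrapper<>(datum), null));
    }

    protected abstract void doMap(IN datum, Consumer<OUT> collector);
}

// Composition: users implement a plain function that knows nothing about
// the framework types.
interface AvroMapFunction<IN, OUT> {
    void map(IN datum, Consumer<OUT> collector);

    // The identity mapping comes back as an ordinary value.
    static <T> AvroMapFunction<T, T> identity() {
        return (datum, collector) -> collector.accept(datum);
    }
}

// One fixed adapter holds the user's function and does the wrapping work.
final class ComposedAvroMapper<IN, OUT>
        extends FrameworkMapper<Wrapper<IN>, Void, Wrapper<OUT>, Void> {
    private final AvroMapFunction<IN, OUT> fn;

    ComposedAvroMapper(AvroMapFunction<IN, OUT> fn) { this.fn = fn; }

    @Override
    void map(Wrapper<IN> key, Void ignored, BiConsumer<Wrapper<OUT>, Void> out) {
        fn.map(key.datum, datum -> out.accept(new Wrapper<>(datum), null));
    }
}

// Tiny driver standing in for the framework's record loop.
final class Demo {
    static List<String> run(
            FrameworkMapper<Wrapper<String>, Void, Wrapper<String>, Void> mapper,
            List<String> inputs) {
        List<String> results = new ArrayList<>();
        for (String in : inputs) {
            mapper.map(new Wrapper<>(in), null, (k, v) -> results.add(k.datum));
        }
        return results;
    }
}
```

The sketch passes the function to the adapter's constructor, which glosses over the hard part: Hadoop instantiates the mapper class reflectively, so a real adapter would have to locate the user's function some other way, for example through the job configuration. That is part of why composition may need help from the Hadoop side.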
On 11/13/11 5:04 AM, "Friso van Vollenhoven" <[EMAIL PROTECTED]> wrote:
> I use my own set of classes for this. I mostly copied from / modeled after the
> Avro mapred support for the old API.
> My approach is slightly different, though. The existing MR support fully
> abstracts / wraps away the Hadoop MR API and only exposes the Avro one. The
> only Hadoop API that the Avro classes see is the Configuration object. The
> problem is that in the new API, the Configuration object is kept within a
> context instance, so you'd need to wrap the whole context and give the wrapper
> to the Avro mapper and reducer. This felt like overkill, so I chose to just
> make mapper and reducer subclasses that handle the Avro work and then call a
> protected method to do the actual mapping or reducing. The drawback is that you
> lose the property of a bare mapper or reducer being the identity function, but
> you could reintroduce this in a generic way, I think. I just don't use the
> identity functions much in practice, so I didn't bother.
> I pushed the code here: https://github.com/friso/avro-mapreduce. There is a
> unit test with some usage examples.
> On 11 nov. 2011, at 20:43, Doug Cutting wrote:
>> On 11/10/2011 12:38 AM, Andrew Kenworthy wrote:
>>> Are there plans to extend it to work with org.apache.hadoop.mapreduce as
>> There's an issue in Jira for this:
>> I don't know of anyone actively working on this at present. It would be
>> a great addition to Avro and I am hopeful someone will resume work on it