Re: Simplifying MapReduce API
Mohammad Tariq 2013-08-29, 11:47
Just to add to the above comments: with the new API you only have to extend
the *Mapper* and *Reducer* classes.
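
To make that concrete, here is a rough, self-contained sketch of the "extend a base class, override one method" shape. Please note the assumptions: SketchMapper, SketchReducer, TokenMapper, SumReducer and the in-memory run() driver are all hypothetical stand-ins I made up so the example compiles without Hadoop on the classpath. In a real job you would extend org.apache.hadoop.mapreduce.Mapper and org.apache.hadoop.mapreduce.Reducer, whose map() and reduce() methods take a Context object rather than the emit callback used here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;

public class WordCountSketch {

    // Hypothetical stand-in for a new-API style Mapper base class (NOT Hadoop's).
    static abstract class SketchMapper<KI, VI, KO, VO> {
        abstract void map(KI key, VI value, BiConsumer<KO, VO> emit);
    }

    // Hypothetical stand-in for a new-API style Reducer base class (NOT Hadoop's).
    static abstract class SketchReducer<KI, VI, KO, VO> {
        abstract void reduce(KI key, Iterable<VI> values, BiConsumer<KO, VO> emit);
    }

    // The map logic lives in its own class: one subclass, one overridden method.
    static class TokenMapper extends SketchMapper<Integer, String, String, Integer> {
        @Override
        void map(Integer lineNo, String line, BiConsumer<String, Integer> emit) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    emit.accept(word, 1); // emit (word, 1) for every token
                }
            }
        }
    }

    // The reduce logic is equally small and independent of the mapper.
    static class SumReducer extends SketchReducer<String, Integer, String, Integer> {
        @Override
        void reduce(String word, Iterable<Integer> counts, BiConsumer<String, Integer> emit) {
            int sum = 0;
            for (int c : counts) {
                sum += c;
            }
            emit.accept(word, sum);
        }
    }

    // Toy in-memory driver: map every line, group values by key, reduce each group.
    public static Map<String, Integer> run(List<String> lines) {
        SketchMapper<Integer, String, String, Integer> mapper = new TokenMapper();
        SketchReducer<String, Integer, String, Integer> reducer = new SumReducer();

        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (int i = 0; i < lines.size(); i++) {
            mapper.map(i, lines.get(i),
                (k, v) -> grouped.computeIfAbsent(k, x -> new ArrayList<>()).add(v));
        }

        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            reducer.reduce(e.getKey(), e.getValue(), result::put);
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {be=2, not=1, or=1, to=2}
        System.out.println(run(List.of("to be or not", "to be")));
    }
}
```

The point is only that each side is a single small subclass, so keeping map and reduce in separate classes costs very little boilerplate.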
On Wed, Aug 28, 2013 at 1:26 AM, Don Nelson <[EMAIL PROTECTED]> wrote:
> I agree with @Shahab - it's simple enough to declare both interfaces in
> one class if that's what you want to do. But given the distributed
> behavior of Hadoop, it's likely that your mappers will be running on
> different nodes than your reducers anyway - why ship around duplicate code?
> On Tue, Aug 27, 2013 at 9:48 AM, Shahab Yunus <[EMAIL PROTECTED]> wrote:
>> For starters (experts might have more complex reasons), what if your
>> respective map and reduce logic becomes complex enough to demand separate
>> classes? Why force clients to implement both by moving them into one Job
>> interface? In the current design you can always implement both (map and
>> reduce) interfaces in one class if your logic is simple enough, and go the
>> other route, separate classes, if that is required. I think it is more
>> flexible this way (you can always build up from, and on top of, a granular
>> design, rather than the other way around).
>> I hope I understood your concern correctly...
>> On Tue, Aug 27, 2013 at 11:35 AM, Andrew Pennebaker <
>> [EMAIL PROTECTED]> wrote:
>>> There seems to be an abundance of boilerplate patterns in MapReduce:
>>> * Write a class extending Map (1), implementing Mapper (2), with a map
>>> method (3)
>>> * Write a class extending Reduce (4), implementing Reducer (5), with a
>>> reduce method (6)
>>> Could we achieve the same behavior with a single Job interface requiring
>>> map() and reduce() methods?
> "A child of five could understand this. Fetch me a child of five."