Thanks for the fast reply!
I've dug into the code a little, and it seems to me that I can achieve my
goal by overriding the Mapper.run method: iterate over the whole split
using context.nextKeyValue(), and then call map only with the values I need.
Since I'm a novice Hadooper, am I approaching this the wrong way?
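
To make the idea concrete, here is a rough sketch of the control flow I have
in mind. It is plain Java so that it runs without a cluster: the Records
interface and the fromLines helper are hypothetical stand-ins for the three
context calls I would use (nextKeyValue, getCurrentKey, getCurrentValue), and
the "keep only the longest lines" rule is just a placeholder for my real
selection logic. In an actual job, I assume the same loop would sit inside an
overridden run(Context context) in my Mapper subclass, with setup(context)
called before it and cleanup(context) after it, as the default run does.

```java
import java.util.ArrayList;
import java.util.List;

class WholeSplitSketch {

    // Hypothetical stand-in for the record-iteration calls on Mapper.Context.
    // This is NOT a Hadoop type; it only mirrors the method names.
    interface Records {
        boolean nextKeyValue();
        long getCurrentKey();
        String getCurrentValue();
    }

    // Hypothetical helper: exposes a list of lines as a Records source,
    // using the line index as the key.
    static Records fromLines(final List<String> lines) {
        return new Records() {
            private int index = -1;
            public boolean nextKeyValue() { return ++index < lines.size(); }
            public long getCurrentKey() { return index; }
            public String getCurrentValue() { return lines.get(index); }
        };
    }

    // Shape of the overridden run(): pass 1 buffers the whole split,
    // pass 2 calls map() only for the records chosen after the scan.
    static List<String> run(Records context) {
        List<String> buffered = new ArrayList<>();
        while (context.nextKeyValue()) {
            // Assumption for real Hadoop: the reader may reuse the same
            // value object between calls, so a copy (e.g. new Text(value))
            // would be needed here. Java Strings are immutable, so this
            // sketch can store them directly.
            buffered.add(context.getCurrentValue());
        }

        // Placeholder decision that needs the whole split: find the
        // maximum line length.
        int longest = 0;
        for (String value : buffered) {
            longest = Math.max(longest, value.length());
        }

        // Second pass: hand only the selected records to map().
        List<String> emitted = new ArrayList<>();
        for (String value : buffered) {
            if (value.length() == longest) {
                map(value, emitted);
            }
        }
        return emitted;
    }

    // Stand-in for map(): here it simply emits the value unchanged.
    static void map(String value, List<String> out) {
        out.add(value);
    }
}
```

One thing I am unsure about is memory: this keeps every record of the split
in memory at once, so it only seems reasonable if a split comfortably fits in
the task's heap.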
On Wed, Oct 12, 2011 at 12:44 PM, Harsh J <[EMAIL PROTECTED]> wrote:
> Hello Yaron,
> Yes, this is possible to do.
> You need to plug in your own RecordReader implementation into the job,
> to control the emits and the action done before feeding key-value pair
> data into map(…).
> On Wed, Oct 12, 2011 at 2:42 PM, Yaron Gonen <[EMAIL PROTECTED]> wrote:
> > Hi,
> > The map method in the Mapper gets as a parameter a single line from the
> > split. Is there a way for Mappers to get the whole split as input?
> > I'd like to scan the whole split before I decide which key-value pairs to
> > emit to the reducer.
> > Thanks
> > yaron
> Harsh J