MapReduce >> mail # user >> Mapper Getting the whole split and not just line by line

Re: Mapper Getting the whole split and not just line by line
Thanks for the fast reply!
I've dug into the code a little, and it seems to me that I can achieve my
goal by overriding the Mapper.run method: iterate over the whole split
using context.nextKeyValue(), and then call map only with the values I need.
Since I'm a novice Hadooper: am I thinking about this the wrong way?

thanks again,
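
The idea above (override run, scan the whole split, then call map selectively) can be sketched without a cluster. This is only an illustration under stated assumptions: SplitContext, SketchMapper, ListContext and LongestLinesMapper are hypothetical stand-ins written for this example, not Hadoop classes. They mirror the general shape of Mapper.run(), which loops on context.nextKeyValue() and calls map once per record; setup and cleanup are left out. The selection rule (keep only the longest lines) is likewise made up for the example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the part of a mapper context used here.
interface SplitContext {
    boolean nextKeyValue();                 // advance; false at end of split
    String getCurrentValue();               // the current line
    void write(String key, Integer value);  // emit one output pair
}

// Hypothetical stand-in mapper with the default record-by-record loop.
abstract class SketchMapper {
    abstract void map(String value, SplitContext context);

    void run(SplitContext context) {
        while (context.nextKeyValue()) {
            map(context.getCurrentValue(), context);
        }
    }
}

// Overrides run(): buffers the whole split, then maps only selected lines.
class LongestLinesMapper extends SketchMapper {
    @Override
    void map(String value, SplitContext context) {
        context.write(value, value.length());
    }

    @Override
    void run(SplitContext context) {
        List<String> buffered = new ArrayList<>();
        int longest = 0;
        // First pass: consume every record in the split.
        while (context.nextKeyValue()) {
            String line = context.getCurrentValue();
            buffered.add(line);
            longest = Math.max(longest, line.length());
        }
        // Second pass: call map() only for the lines we decided to keep.
        for (String line : buffered) {
            if (line.length() == longest) {
                map(line, context);
            }
        }
    }
}

// In-memory context so the sketch can run on a plain list of lines.
class ListContext implements SplitContext {
    private final Iterator<String> lines;
    private String current;
    final List<String> emitted = new ArrayList<>();

    ListContext(List<String> split) {
        this.lines = split.iterator();
    }

    public boolean nextKeyValue() {
        if (!lines.hasNext()) {
            return false;
        }
        current = lines.next();
        return true;
    }

    public String getCurrentValue() {
        return current;
    }

    public void write(String key, Integer value) {
        emitted.add(key + "\t" + value);
    }
}
```

Two caveats if this is carried over to a real job: the buffered records must fit in the task's memory, and because Hadoop reuses the key and value objects between calls, each value would need to be copied before it is stored in the buffer.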

On Wed, Oct 12, 2011 at 12:44 PM, Harsh J <[EMAIL PROTECTED]> wrote:

> Hello Yaron,
> Yes, this is possible to do.
> You need to plug your own RecordReader implementation into the job
> to control what is emitted, and what is done, before key-value pair
> data is fed into map(…).
> On Wed, Oct 12, 2011 at 2:42 PM, Yaron Gonen <[EMAIL PROTECTED]>
> wrote:
> > Hi,
> > The map method in the Mapper gets as a parameter a single line from the
> > split. Is there a way for Mappers to get the whole split as input?
> > I'd like to scan the whole split before I decide which key-value pairs to
> > emit to the reducer.
> > Thanks
> > yaron
> >
> --
> Harsh J
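
The RecordReader route suggested in the reply can be sketched in the same spirit: a reader that drains a line-by-line reader and hands the whole split to map as a single record. Again this is a rough sketch under assumptions, not Hadoop code: SketchReader, LineListReader and WholeSplitReader are hypothetical names invented for this example, and only the nextKeyValue()/getCurrentValue() pair of a record reader is modelled.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the two reader methods this sketch needs.
interface SketchReader<V> {
    boolean nextKeyValue();  // advance; false when no records remain
    V getCurrentValue();     // the current record
}

// Line-by-line reader over an in-memory list, standing in for a line reader.
class LineListReader implements SketchReader<String> {
    private final Iterator<String> lines;
    private String current;

    LineListReader(List<String> split) {
        this.lines = split.iterator();
    }

    public boolean nextKeyValue() {
        if (!lines.hasNext()) {
            return false;
        }
        current = lines.next();
        return true;
    }

    public String getCurrentValue() {
        return current;
    }
}

// Wraps a line reader and exposes the whole split as one record.
class WholeSplitReader implements SketchReader<List<String>> {
    private final SketchReader<String> delegate;
    private List<String> all = new ArrayList<>();
    private boolean done = false;

    WholeSplitReader(SketchReader<String> delegate) {
        this.delegate = delegate;
    }

    public boolean nextKeyValue() {
        if (done) {
            return false;  // the single record was already handed out
        }
        done = true;
        all = new ArrayList<>();
        while (delegate.nextKeyValue()) {
            all.add(delegate.getCurrentValue());
        }
        // Report a record only if the split actually contained lines.
        return !all.isEmpty();
    }

    public List<String> getCurrentValue() {
        return all;
    }
}
```

With a reader shaped like this, map would be called at most once per split and would receive every line at once, so the decision about which pairs to emit moves into map itself rather than into run.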