MapReduce >> mail # user >> Mapper Getting the whole split and not just line by line


Re: Mapper Getting the whole split and not just line by line
Thanks for the fast reply!
I've dug into the code a little, and it seems to me that I can achieve my
goal by overriding the Mapper.run method: iterate over the whole split
using context.nextKeyValue(), and then call map only with the values I need.
Since I'm a novice Hadooper, am I thinking about this the wrong way?

thanks again,
yaron
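
For readers following the thread, the idea above can be sketched roughly as follows. This is a plain-Java illustration, not code from the thread and not Hadoop itself: SplitContext is a hypothetical stand-in that mirrors only the nextKeyValue(), getCurrentValue() and write() calls of Mapper.Context, so the example runs without Hadoop on the classpath. The selection rule (emit only lines seen more than once in the split) is also just an assumption chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of the "override run()" idea: scan the whole split, then call map().
class WholeSplitMapperSketch {

    /** Hypothetical stand-in for Mapper.Context; one record per input line. */
    static class SplitContext {
        private final Iterator<String> lines;
        private String current;
        final List<String> emitted = new ArrayList<>();

        SplitContext(List<String> lines) {
            this.lines = lines.iterator();
        }

        /** Advances to the next record; returns false at the end of the split. */
        boolean nextKeyValue() {
            if (!lines.hasNext()) {
                return false;
            }
            current = lines.next();
            return true;
        }

        String getCurrentValue() {
            return current;
        }

        /** Records an emitted key-value pair as "key<TAB>value". */
        void write(String key, int value) {
            emitted.add(key + "\t" + value);
        }
    }

    /** Called only for records selected after the whole split was scanned. */
    static void map(String line, int count, SplitContext context) {
        context.write(line, count);
    }

    /** Replaces the usual one-record-at-a-time loop with a two-pass scan. */
    static void run(SplitContext context) {
        // Pass 1: consume every record in the split and keep what we need.
        Map<String, Integer> counts = new LinkedHashMap<>();
        while (context.nextKeyValue()) {
            counts.merge(context.getCurrentValue(), 1, Integer::sum);
        }
        // Pass 2: the whole split has been seen, so decide what to emit.
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > 1) {
                map(entry.getKey(), entry.getValue(), context);
            }
        }
    }

    public static void main(String[] args) {
        SplitContext context =
                new SplitContext(Arrays.asList("a", "b", "a", "c", "b", "a"));
        run(context);
        context.emitted.forEach(System.out::println);
    }
}
```

In a real job the same shape would live in a Mapper subclass whose run(Context) also calls setup(context) before the loop and cleanup(context) after it, as the default implementation does. Two caveats apply there: Hadoop reuses the key and value objects between nextKeyValue() calls, so anything buffered has to be copied first, and buffering assumes the retained data for one split fits in the task's memory.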

On Wed, Oct 12, 2011 at 12:44 PM, Harsh J <[EMAIL PROTECTED]> wrote:

> Hello Yaron,
>
> Yes, this is possible to do.
>
> You need to plug your own RecordReader implementation into the job
> to control what is emitted and what is done before key-value pair
> data is fed into map(…).
>
> On Wed, Oct 12, 2011 at 2:42 PM, Yaron Gonen <[EMAIL PROTECTED]>
> wrote:
> > Hi,
> > The map method in the Mapper gets as a parameter a single line from the
> > split. Is there a way for Mappers to get the whole split as input?
> > I'd like to scan the whole split before I decide which key-value pairs to
> > emit to the reducer.
> > Thanks
> > yaron
> >
>
>
>
> --
> Harsh J
>