Hadoop >> mail # user >> retain state between mappers


Re: retain state between mappers
Remember that mappers are not executed in a well-defined order.

They can be executed in a different order or even at the same time.  One
mapper can also be run more than once.

There are two ways to get something like what you want, but the question you
asked is ill-posed.

First, you can adapt the input format so that it gives a different integer
to each split as a key.  This does something like what you ask for, since
each mapper will get a different integer, and mappers that are executed more
than once will get the same key each time they are run.
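
To make the first option concrete, here is a minimal sketch in plain Java.
It is only a simulation and does not use any Hadoop classes: Split,
makeSplits, map and runJob are hypothetical names standing in for an input
format that hands each split its own index.  The point it tries to show is
that the key depends only on the split, so execution order and repeated
execution do not change the result.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

// Plain-Java simulation of "one integer key per split"; no Hadoop API is used.
final class SplitIndexKeys {

    // Hypothetical stand-in for an input split: its index plus its records.
    static final class Split {
        final int index;
        final List<String> records;

        Split(int index, List<String> records) {
            this.index = index;
            this.records = records;
        }
    }

    // Assigns each chunk of input its position in the list as the split index.
    static List<Split> makeSplits(List<List<String>> chunks) {
        List<Split> splits = new ArrayList<>();
        for (int i = 0; i < chunks.size(); i++) {
            splits.add(new Split(i, chunks.get(i)));
        }
        return splits;
    }

    // "Mapper": tags every record with the split index. It reads no shared state.
    static List<String> map(Split split) {
        List<String> out = new ArrayList<>();
        for (String record : split.records) {
            out.add(split.index + ":" + record);
        }
        return out;
    }

    // Runs the map tasks in a shuffled order, with split 0 executed twice,
    // and keeps the latest output per split index.
    static Map<Integer, List<String>> runJob(List<Split> splits, long seed) {
        List<Split> attempts = new ArrayList<>(splits);
        attempts.add(splits.get(0)); // simulated re-execution of one task
        Collections.shuffle(attempts, new Random(seed));
        Map<Integer, List<String>> output = new TreeMap<>();
        for (Split s : attempts) {
            output.put(s.index, map(s));
        }
        return output;
    }

    public static void main(String[] args) {
        List<Split> splits = makeSplits(List.of(
                List.of("a", "b"), List.of("c"), List.of("d", "e")));
        // Different orders, same result: both lines print the same map.
        System.out.println(runJob(splits, 1L));
        System.out.println(runJob(splits, 2L));
    }
}
```

In a real job the index would have to come from the input format itself;
how that is wired up depends on the Hadoop API version in use and is not
shown here.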

Secondly, you could use a central coordination server to act as a global
counter.  **THIS IS A REALLY BAD IDEA**  It is bad because it turns a
parallel computation into a partially sequential one and because it doesn't
account for the fact that mappers can be run multiple times.
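
The re-execution problem can be seen in a small simulation.  Again this is
plain Java, not Hadoop code, and CounterServer and mapAttempt are
hypothetical names: the "server" is just an in-process atomic counter,
whereas a real one would also cost a network round trip per call, which is
where the sequential bottleneck comes from.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Plain-Java simulation of a global counter shared by map task attempts.
final class GlobalCounterPitfall {

    // Hypothetical stand-in for a central coordination server.
    static final class CounterServer {
        private final AtomicInteger next = new AtomicInteger();

        // Every caller is serialized through this single counter.
        int take() {
            return next.getAndIncrement();
        }
    }

    // One attempt of a map task: numbers each record using the shared counter.
    static List<String> mapAttempt(List<String> records, CounterServer server) {
        List<String> out = new ArrayList<>();
        for (String record : records) {
            out.add(server.take() + ":" + record);
        }
        return out;
    }

    public static void main(String[] args) {
        CounterServer server = new CounterServer();
        List<String> records = List.of("a", "b");

        List<String> first = mapAttempt(records, server);
        // The same task is run a second time over the same input.
        List<String> retry = mapAttempt(records, server);

        System.out.println(first); // [0:a, 1:b]
        System.out.println(retry); // [2:a, 3:b]
    }
}
```

The same records come back with different numbers on the second attempt, so
whichever attempt's output is kept, the counter has already moved on and the
numbering no longer matches the input.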

On Sat, Feb 5, 2011 at 4:03 AM, ANKITBHATNAGAR <[EMAIL PROTECTED]> wrote:

> Is there a way I can retain the count between mappers and increment it?
>