NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
MapReduce >> mail # dev >> running Combiner for all the task on the node.


+
Suresh S 2013-01-02, 12:23
+
Harsh J 2013-01-02, 17:48
Re: running Combiner for all the task on the node.
Hi Suresh,

The combiner function aggregates the output of a single map task only,
not the output of all the maps running on a given node.
AFAIK, because each map runs in its own child JVM, the intermediate data
would still have to be serialized (moved) between those JVMs before a
combiner could aggregate it at the node level.
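
To make the difference concrete, here is a minimal sketch in plain Python (not Hadoop code) that compares combining per map task with a hypothetical node-level combine. The word-count job, the helper names `map_task` and `combine`, and the two input splits are all assumptions made up for illustration. The sketch only counts records and ignores the cost of moving data between child JVMs mentioned above.

```python
from collections import Counter


def map_task(split):
    """Emit (word, 1) pairs for one input split, like a word-count mapper."""
    return [(word, 1) for line in split for word in line.split()]


def combine(pairs):
    """Sum the values for each key, like a word-count combiner."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())


# Hypothetical node running two map tasks, each with its own input split.
splits_on_node = [
    ["a b a", "b a"],
    ["a c", "a a"],
]

map_outputs = [map_task(split) for split in splits_on_node]
records_raw = sum(len(out) for out in map_outputs)

# Combiner applied separately to each map task's output.
per_task = [combine(out) for out in map_outputs]
records_per_task = sum(len(combined) for combined in per_task)

# Hypothetical node-level combine over the per-task results.
node_level = combine(pair for combined in per_task for pair in combined)
records_node_level = len(node_level)

print("raw map output records:  ", records_raw)
print("after per-task combine:  ", records_per_task)
print("after node-level combine:", records_node_level)
```

In this toy input the key `a` appears in both splits, so the per-task combine still leaves one `a` record per task, and only the node-level step merges them. How much that would save in a real job depends on how many keys the map tasks on a node share, and on the cost of gathering their outputs in one place.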

Best,
Mahesh Balija,
Calsoft Labs.
On Wed, Jan 2, 2013 at 5:53 PM, Suresh S <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I think running the combiner function at the node level (to combine the
> output of all map tasks on the node) may reduce intermediate data movement.
>
> I don't know whether this technique is already available. Is it worth
> working in this direction?
> Any suggestions? Thanks in advance.
> *Regards*
> *S.Suresh,*
> *Research Scholar,*
> *Department of Computer Applications,*
> *National Institute of Technology,*
> *Tiruchirappalli - 620015.*
> *+91-9941506562*
>
+
Mahesh Balija 2013-01-02, 12:50
+
Suresh S 2013-01-02, 13:04
+
Mahesh Balija 2013-01-02, 14:06