MapReduce >> mail # user >> How to Influence Reduce Task Location.


Re: How to Influence Reduce Task Location.
Suppose that the output is written to a database that runs only on certain nodes.  It would be desirable to schedule the reduce tasks on nodes local or close to the database nodes.
 
Thanks,
Jane

--- On Sat, 12/18/10, Hari Sreekumar <[EMAIL PROTECTED]> wrote:
From: Hari Sreekumar <[EMAIL PROTECTED]>
Subject: Re: How to Influence Reduce Task Location.
To: [EMAIL PROTECTED]
Date: Saturday, December 18, 2010, 10:35 AM
You can specify that a group of keys should go to the same host for reducing, but I have never encountered any situation where you need to know beforehand exactly which host a particular key should go to. I am not sure if that can be done. Just out of curiosity, why do you need this kind of control over reduction?
Hari
On Sat, Dec 18, 2010 at 11:54 PM, Jane Chen <[EMAIL PROTECTED]> wrote:
But how does this help me request which host to schedule the reduce task to?

Thanks,
Jane

--- On Sat, 12/18/10, Hari Sreekumar <[EMAIL PROTECTED]> wrote:
From: Hari Sreekumar <[EMAIL PROTECTED]>
Subject: Re: How to Influence Reduce Task Location.
To: [EMAIL PROTECTED]
Date: Saturday, December 18, 2010, 10:16 AM

Hi Jane,
The Partitioner class can be used to achieve this (http://hadoop.apache.org/mapreduce/docs/r0.21.0/api/org/apache/hadoop/mapreduce/Partitioner.html).
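
A rough sketch of the idea, kept free of Hadoop dependencies so it runs on its own. In a real job the same logic would go into the getPartition(key, value, numPartitions) method of a subclass of the Partitioner class linked above. The class name KeyGroupPartitioner, the helper groupOf, and the "<group>:<rest>" key layout are made up for illustration only. Note that, as far as I know, this only decides which reduce task (partition) a key goes to; the host that runs that task is still picked by the scheduler.

```java
/**
 * Standalone sketch of partitioning logic. Hypothetical example:
 * in Hadoop this body would sit inside a Partitioner subclass.
 */
final class KeyGroupPartitioner {

    // Assumed key layout "<group>:<rest>", e.g. "eu:order-17".
    // Keys without a ':' are treated as their own group.
    static String groupOf(String key) {
        int sep = key.indexOf(':');
        return sep < 0 ? key : key.substring(0, sep);
    }

    // Every key in the same group maps to the same partition number.
    // Masking with Integer.MAX_VALUE clears the sign bit so the
    // remainder is never negative.
    static int getPartition(String key, int numPartitions) {
        return (groupOf(key).hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        int reducers = 4; // assumed number of reduce tasks
        for (String key : new String[] {"eu:order-17", "eu:order-42", "us:order-3"}) {
            System.out.println(key + " -> partition " + getPartition(key, reducers));
        }
    }
}
```

With this rule both "eu:..." keys land in the same partition, so the same reduce task sees them.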
Thanks,
Hari
On Sat, Dec 18, 2010 at 11:13 PM, Jane Chen <[EMAIL PROTECTED]> wrote:

Hi All,

Is there any way to influence where a reduce task is run?  We have a case where we'd like to choose the host to run the reduce task based on the task's input key.

Any suggestion is greatly appreciated.

Thanks,
Jane
