MapReduce, mail # user - How to Influence Reduce Task Location.


Re: How to Influence Reduce Task Location.
Jane Chen 2010-12-19, 18:23
Suppose the output is written to a database that runs only on certain nodes. It would then be desirable to schedule the reduce tasks on nodes local to, or close to, the database nodes.
 
Thanks,
Jane

--- On Sat, 12/18/10, Hari Sreekumar <[EMAIL PROTECTED]> wrote:
From: Hari Sreekumar <[EMAIL PROTECTED]>
Subject: Re: How to Influence Reduce Task Location.
To: [EMAIL PROTECTED]
Date: Saturday, December 18, 2010, 10:35 AM
You can specify that a group of keys should go to the same host for reducing, but I have never encountered any situation where you need to know beforehand exactly which host a particular key should go to. I am not sure if that can be done. Just out of curiosity, why do you need this kind of control over reduction?
Hari
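
[Editor's sketch, not part of the original message.] The key-grouping behaviour described above can be illustrated as follows. This is a minimal sketch that only mirrors the contract of Hadoop's Partitioner.getPartition(key, value, numPartitions) in plain Java, so it runs without the Hadoop jars. The class name, the "<group>:<rest>" key layout, and the groupOf helper are assumptions made for illustration. The sign-bit masking follows the approach of Hadoop's default HashPartitioner.

```java
// Sketch only: mirrors the contract of Hadoop's
// org.apache.hadoop.mapreduce.Partitioner#getPartition(key, value, numPartitions)
// without depending on the Hadoop jars, so it compiles with a plain JDK.
class KeyGroupPartitionerSketch {

    // Hypothetical key layout: "<group>:<rest>", e.g. "db1:row42".
    // Keys without a ':' are treated as their own group.
    static String groupOf(String key) {
        int sep = key.indexOf(':');
        return sep < 0 ? key : key.substring(0, sep);
    }

    // Returns a partition number in [0, numPartitions).
    // All keys sharing a group land in the same partition, hence the same reduce task.
    // Note: this selects a reduce task, not a host; the framework decides where the task runs.
    static int getPartition(String key, int numPartitions) {
        // Mask the sign bit so a negative hashCode cannot yield a negative partition.
        return (groupOf(key).hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        int reducers = 4;
        for (String key : new String[] {"db1:row1", "db1:row2", "db2:row1"}) {
            System.out.println(key + " -> partition " + getPartition(key, reducers));
        }
    }
}
```

In a real job the same logic would presumably live in a subclass of Partitioner that overrides getPartition and is registered with Job.setPartitionerClass. As the message above notes, this controls which keys are reduced together, not which host the reduce task is scheduled on.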
On Sat, Dec 18, 2010 at 11:54 PM, Jane Chen <[EMAIL PROTECTED]> wrote:
But how does this help me request which host the reduce task is scheduled on?

Thanks,
Jane

--- On Sat, 12/18/10, Hari Sreekumar <[EMAIL PROTECTED]> wrote:
From: Hari Sreekumar <[EMAIL PROTECTED]>
Subject: Re: How to Influence Reduce Task Location.
To: [EMAIL PROTECTED]
Date: Saturday, December 18, 2010, 10:16 AM

Hi Jane,
The Partitioner class can be used to achieve this: http://hadoop.apache.org/mapreduce/docs/r0.21.0/api/org/apache/hadoop/mapreduce/Partitioner.html
Thanks,
Hari
On Sat, Dec 18, 2010 at 11:13 PM, Jane Chen <[EMAIL PROTECTED]> wrote:

Hi All,

Is there any way to influence where a reduce task is run? We have a case where we'd like to choose the host that runs the reduce task based on the task's input key.

Any suggestions are greatly appreciated.

Thanks,
Jane