Re: Find reducer for a key
Hi Hemanth,

thanks for your reply.
Yes, this partially answers my question. I know how the hash
partitioner works and I had guessed something similar.
The piece I was missing is that mapred.task.partition holds the
partition number of the current reducer.
So, putting all the pieces together, I understand that for each key in
the file I have to call HashPartitioner.getPartition.
Then I have to compare the returned index with the one retrieved by
Configuration.getInt("mapred.task.partition", -1).
If they are equal, that key will be served by this reducer. Is this correct?
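
A minimal sketch of that check, assuming the job uses the default
HashPartitioner and that the number of reducers is known. The class and
method names (PartitionCheck, partitionFor, belongsToReducer) are
hypothetical, and the code does not depend on Hadoop: it only repeats the
formula from the HashPartitioner code quoted below. In a real job the key
would have to be the same Writable type the mappers emit (for example
Text), because its hashCode() may differ from String.hashCode().

```java
public class PartitionCheck {

    // Same formula as the HashPartitioner.getPartition code quoted in this
    // thread: clear the sign bit, then take the remainder.
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // True if the key would be routed to the reducer whose partition number
    // is myPartition (assumption: myPartition was read from
    // mapred.task.partition in the reducer's configuration).
    static boolean belongsToReducer(Object key, int numReduceTasks, int myPartition) {
        return partitionFor(key, numReduceTasks) == myPartition;
    }

    public static void main(String[] args) {
        int numReduceTasks = 4; // illustrative value only
        for (String key : new String[] {"a", "b", "e"}) {
            System.out.println(key + " -> partition " + partitionFor(key, numReduceTasks));
        }
    }
}
```

With a custom partitioner plugged into the job, the comparison would have
to call that partitioner instead of this formula.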
To answer your question:
On the reduce side of an MR job, I want to load some data from a file
into an in-memory structure. I don't need to store the whole file
in each reducer, only the lines related to the keys that
particular reducer will receive.
So my intention is to determine the keys in the setup method and store only
the needed lines.
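
The filtering step could be sketched as follows. This is only an
illustration under stated assumptions: the names ReducerSideFilter, keyOf
and linesForReducer are hypothetical, the line layout (key, optionally
followed by a tab and a payload) is invented for the example, and plain
String keys stand in for the Writable keys of a real job. In the reducer's
setup method, myPartition would presumably come from
context.getConfiguration().getInt("mapred.task.partition", -1) and
numReduceTasks from context.getNumReduceTasks().

```java
import java.util.ArrayList;
import java.util.List;

public class ReducerSideFilter {

    // Same formula as the default HashPartitioner.
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Hypothetical line layout: the key is the text before the first tab,
    // or the whole line when there is no tab.
    static String keyOf(String line) {
        int tab = line.indexOf('\t');
        return tab < 0 ? line : line.substring(0, tab);
    }

    // Keeps only the lines whose key hashes to this reducer's partition.
    static List<String> linesForReducer(List<String> lines,
                                        int numReduceTasks,
                                        int myPartition) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (partitionFor(keyOf(line), numReduceTasks) == myPartition) {
                kept.add(line);
            }
        }
        return kept;
    }
}
```

The result only matches what the reducer actually receives if the keys are
hashed exactly as the mappers' output keys are, with the same partitioner
and the same number of reduce tasks.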

Thanks,
Alberto
On 28 March 2013 11:01, Hemanth Yamijala <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Not sure if I am answering your question, but this is the background. Every
> MapReduce job has a partitioner associated with it. The default partitioner is
> HashPartitioner. As a user, you can also write your own partitioner and
> plug it into the job. The partitioner is responsible for splitting the map
> output key space among the reducers.
>
> So, the reducer a key will go to is the value
> returned by the partitioner's getPartition method. For example, this is the code
> in HashPartitioner:
>
>   public int getPartition(K2 key, V2 value,
>                           int numReduceTasks) {
>     return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
>   }
>
> mapred.task.partition is the configuration key that holds the partition number of this
> reducer.
>
> I guess you can piece these bits together into what you want. However, I
> am interested in understanding why you want to know this. Can you share
> some info?
>
> Thanks
> Hemanth
>
>
> On Thu, Mar 28, 2013 at 2:17 PM, Alberto Cordioli
> <[EMAIL PROTECTED]> wrote:
>>
>> Hi everyone,
>>
>> how can I know the keys that are associated with a particular reducer in
>> the setup method?
>> Let's assume the setup method reads from a file where each line
>> is a string that will become a key emitted by the mappers.
>> For each of these lines I would like to know whether the string will be a
>> key associated with the current reducer or not.
>>
>> I read something about mapred.task.partition and mapred.task.id, but I
>> didn't understand how to use them.
>>
>>
>> Thanks,
>> Alberto
>>
>>
>> --
>> Alberto Cordioli
>
>

--
Alberto Cordioli
Alberto Cordioli 2013-03-30, 13:28