Hadoop, mail # user - Re: manipulating key in combine phase


Re: manipulating key in combine phase
Amit Sela 2014-01-13, 17:39
More than a solution, I'd like to know whether a combiner is allowed to change
the key. Will it interfere with the mapper's sort/merge?
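
To make the concern concrete, here is a minimal sketch in plain Java (no Hadoop dependency) of what I suspect is happening. It rests on an assumption: that combiner output is written to the spill as-is, without being re-sorted, and that the later merge treats every spill as already sorted by key. The names CombinerKeyDemo, combine, merge and countGroups, and the "_split" suffix standing in for the random UUID, are made up for illustration and are not Hadoop APIs.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

final class CombinerKeyDemo {

    // Hypothetical combiner: walks one sorted run of keys and, from the
    // threshold-th record of each key onward, rewrites the key by appending
    // a suffix (mirrors the "++count >= 100" logic from the question).
    static List<String> combine(List<String> sortedKeys, int threshold, String suffix) {
        List<String> out = new ArrayList<>();
        String current = null;
        int count = 0;
        for (String key : sortedKeys) {
            if (!key.equals(current)) {   // new key group starts, reset the counter
                current = key;
                count = 0;
            }
            out.add(++count >= threshold ? key + suffix : key);
        }
        return out;                        // assumption: emitted as-is, never re-sorted
    }

    // Standard two-way merge that trusts both inputs to be sorted.
    static List<String> merge(List<String> a, List<String> b) {
        List<String> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            out.add(a.get(i).compareTo(b.get(j)) <= 0 ? a.get(i++) : b.get(j++));
        }
        while (i < a.size()) out.add(a.get(i++));
        while (j < b.size()) out.add(b.get(j++));
        return out;
    }

    // Counts groups the way a reducer would see them: runs of adjacent equal keys.
    static int countGroups(List<String> merged) {
        int groups = 0;
        String previous = null;
        for (String key : merged) {
            if (!key.equals(previous)) groups++;
            previous = key;
        }
        return groups;
    }

    public static void main(String[] args) {
        // Two sorted spills; the rewritten key "k_split" sorts after "k1",
        // so the first combined run is no longer in key order.
        List<String> run1 = combine(List.of("k", "k", "k", "k1"), 2, "_split");
        List<String> run2 = combine(List.of("k", "k1", "k1", "k1"), 2, "_split");
        List<String> merged = merge(run1, run2);
        System.out.println("merged stream: " + merged);
        System.out.println("distinct keys: " + new HashSet<>(merged).size());
        System.out.println("groups seen:   " + countGroups(merged));
    }
}
```

With this input the merged stream has 4 distinct keys but 5 groups, because "k1" shows up in two separate places. That looks a lot like the "same key written more than once" symptom, assuming the real merge behaves like this one.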
On Mon, Jan 13, 2014 at 3:06 PM, Devin Suiter RDX <[EMAIL PROTECTED]> wrote:

> Amit,
>
> Have you explored the ChainMapper class?
>
> *Devin Suiter*
> Jr. Data Solutions Software Engineer
> 100 Sandusky Street | 2nd Floor | Pittsburgh, PA 15212
> Google Voice: 412-256-8556 | www.rdx.com
>
>
> On Sun, Jan 12, 2014 at 7:28 PM, John Lilley <[EMAIL PROTECTED]> wrote:
>
>> Isn’t this what you’d normally do in the Mapper?
>>
>> My understanding of the combiner is that it is like a “mapper-side
>> pre-reducer” and operates on blocks of data that have already been sorted
>> by key, so mucking with the keys doesn’t **seem** like a good idea.
>>
>> john
>>
>>
>>
>> *From:* Amit Sela [mailto:[EMAIL PROTECTED]]
>> *Sent:* Sunday, January 12, 2014 9:26 AM
>> *To:* [EMAIL PROTECTED]
>> *Subject:* manipulating key in combine phase
>>
>>
>>
>> Hi all,
>>
>>
>>
>> I was wondering if it is possible to manipulate the key during combine:
>>
>>
>>
>> Say I have a mapreduce job where the key has many qualifiers.
>>
>> I would like to "split" the key into two (or more) keys if it has more
>> than, say, 100 qualifiers.
>>
>> In the combiner class I would do something like:
>>
>>
>>
>> int count = 0;
>> for (Writable value : values) {
>>   if (++count >= 100) {
>>     context.write(newKey, value);
>>   } else {
>>     context.write(key, value);
>>   }
>> }
>>
>>
>>
>> where newKey is something like key+randomUUID
>>
>>
>>
>> I know that the combiner can be called "zero, once or more..." and I'm
>> getting strange results (same key written more than once), so I would be
>> glad to get some deeper insight into how the combiner works.
>>
>>
>>
>> Thanks,
>>
>>
>>
>> Amit.
>>
>
>