Hadoop, mail # user - reducer out of memory?


Re: reducer out of memory?
Yang 2012-05-10, 18:49
Thanks, let me try this.
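
To make the quoted suggestion concrete, here is a rough arithmetic sketch in plain Java of what lowering mapred.job.shuffle.input.buffer.percent does to the in-memory shuffle budget. It is not Hadoop code: the class name `ShuffleBudget`, the method `shuffleBufferBytes`, and the 1 GiB heap are assumptions made for illustration, and the real reducer applies further thresholds on top of this fraction.

```java
// Rough arithmetic only. The names and the 1 GiB heap are hypothetical,
// chosen for illustration; they do not come from the thread or from Hadoop.
final class ShuffleBudget {

    /** Bytes of reducer heap that the given fraction reserves for the shuffle buffer. */
    static long shuffleBufferBytes(long maxHeapBytes, double bufferPercent) {
        if (bufferPercent < 0.0 || bufferPercent > 1.0) {
            throw new IllegalArgumentException(
                "bufferPercent must be in [0, 1]: " + bufferPercent);
        }
        return (long) (maxHeapBytes * bufferPercent);
    }

    public static void main(String[] args) {
        long heap = 1L << 30; // assumed reducer heap of 1 GiB (e.g. -Xmx1g)
        System.out.println("default 0.7 -> " + shuffleBufferBytes(heap, 0.7) + " bytes");
        System.out.println("lowered 0.3 -> " + shuffleBufferBytes(heap, 0.3) + " bytes");
    }
}
```

With a lower fraction, less heap is held by fetched map outputs and more is left for the reduce code itself. The value 0.3 is only an example, not a recommendation.
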
On Wed, May 9, 2012 at 11:27 PM, Zizon Qiu <[EMAIL PROTECTED]> wrote:
> Try setting a lower value for mapred.job.shuffle.input.buffer.percent.
> The reducer uses it to decide whether to use the in-memory shuffle.
> The default value is 0.7, meaning 70% of the reducer's maximum heap is used
> as the shuffle buffer.
>
> On Thu, May 10, 2012 at 2:50 AM, Yang <[EMAIL PROTECTED]> wrote:
>
>> It seems that if I put too many records under the same mapper output
>> key, all these records are grouped under one key on one reducer,
>>
>> and then the reducer runs out of memory.
>>
>>
>> But the reducer interface is:
>>
>>       public void reduce(K key, Iterator<V> values,
>>                          OutputCollector<K, V> output,
>>                          Reporter reporter)
>>
>>
>> So all the values belonging to the key can be iterated, which means
>> that theoretically they can be read from disk and do not have to be
>> in memory at the same time.
>> So why am I getting an out-of-heap error? Is there some parameter I could
>> tune (apart from -Xmx, since my box is ultimately bounded in memory
>> capacity)?
>>
>> thanks
>> Yang
>>
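
The point about iterating the values can be illustrated outside Hadoop. The sketch below is plain Java with hypothetical names (`StreamingReduceSketch`, `sumStreaming`, `countingIterator`); it does not use the Hadoop API. It shows a reduce-style loop that folds each value into a running total, so memory use does not grow with the number of values for a key. Reducer code that instead copies the values into a list would hold them all on the heap at once. The thread does not show the reducer body, so whether that applies here is unknown.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Plain-Java illustration; all names here are hypothetical, not Hadoop classes.
final class StreamingReduceSketch {

    /** Sums values one at a time, keeping only the running total in memory. */
    static long sumStreaming(Iterator<Long> values) {
        long total = 0;
        while (values.hasNext()) {
            total += values.next();
        }
        return total;
    }

    /** Stand-in for a large value stream: yields 1..n lazily, never building a list. */
    static Iterator<Long> countingIterator(final long n) {
        return new Iterator<Long>() {
            private long next = 1;

            @Override
            public boolean hasNext() {
                return next <= n;
            }

            @Override
            public Long next() {
                if (next > n) {
                    throw new NoSuchElementException();
                }
                return next++;
            }

            @Override
            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }

    public static void main(String[] args) {
        // One million values are consumed without ever being stored together.
        System.out.println(sumStreaming(countingIterator(1_000_000L)));
    }
}
```
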