Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
MapReduce >> mail # user >> OutOfMemoryError during reduce shuffle


Copy link to this message
-
Re: OutOfMemoryError during reduce shuffle
There are a few tweaks In configuration that may help. Can you please look
at
http://hadoop.apache.org/docs/r1.0.4/mapred_tutorial.html#Shuffle%2FReduce+Parameters

Also, since you have mentioned reducers are unbalanced, could you use a
custom partitioner to balance out the outputs. Or just increase the number
of reducers so the load is spread out.

Thanks
Hemanth

On Wednesday, February 20, 2013, Shivaram Lingamneni wrote:

> I'm experiencing the following crash during reduce tasks:
>
> https://gist.github.com/slingamn/04ff3ff3412af23aa50d
>
> on Hadoop 1.0.3 (specifically I'm using Amazon's EMR, AMI version
> 2.2.1). The crash is triggered by especially unbalanced reducer
> inputs, i.e., when one reducer receives too many records. (The reduce
> task gets retried three times, but since the data is the same every
> time, it crashes each time in the same place and the job fails.)
>
> From the following links:
>
> https://issues.apache.org/jira/browse/MAPREDUCE-1182
>
>
> http://hadoop-common.472056.n3.nabble.com/Shuffle-In-Memory-OutOfMemoryError-td433197.html
>
> it seems as though Hadoop is supposed to prevent this from happening
> by intelligently managing the amount of memory that is provided to the
> shuffle. However, I don't know how ironclad this guarantee is.
>
> Can anyone advise me on how robust I can expect Hadoop to be to this
> issue, in the face of highly unbalanced reducer inputs? Thanks very
> much for your time.
>
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB