MapReduce, mail # dev - spill taking too long


spill taking too long
Ted Yu 2010-11-04, 18:36
Hi,
We use cdh3b2.
We sometimes see map tasks time out. Here is the log from one of the map tasks:

2010-11-04 10:34:23,820 INFO org.apache.hadoop.mapred.MapTask: Spilling map output: buffer full= true
2010-11-04 10:34:23,820 INFO org.apache.hadoop.mapred.MapTask: bufstart = 119534169; bufend = 59763857; bufvoid = 298844160
2010-11-04 10:34:23,820 INFO org.apache.hadoop.mapred.MapTask: kvstart = 438913; kvend = 585320; length = 983040
2010-11-04 10:34:41,615 INFO org.apache.hadoop.mapred.MapTask: Finished spill 3
2010-11-04 10:35:45,352 INFO org.apache.hadoop.mapred.MapTask: Spilling map output: buffer full= true
2010-11-04 10:35:45,547 INFO org.apache.hadoop.mapred.MapTask: bufstart = 59763857; bufend = 298837899; bufvoid = 298844160
2010-11-04 10:35:45,547 INFO org.apache.hadoop.mapred.MapTask: kvstart = 585320; kvend = 731585; length = 983040
2010-11-04 10:45:41,289 INFO org.apache.hadoop.mapred.MapTask: Finished spill 4

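Note how long the last spill took: spill 4 needed almost 10 minutes (10:35:45 to 10:45:41), whereas spill 3 finished in about 18 seconds.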
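
To put numbers on the difference, here is a rough Python sketch that pulls the spill durations out of the log lines above. The function names (spill_report, spilled_bytes) are made up for illustration. The byte count assumes that bufstart and bufend delimit the spilled region of a circular buffer that wraps at bufvoid; that is my reading of the fields, not something the log states.

```python
import re
from datetime import datetime

# Log excerpt from the message, with wrapped lines joined.
LOG = """\
2010-11-04 10:34:23,820 INFO org.apache.hadoop.mapred.MapTask: Spilling map output: buffer full= true
2010-11-04 10:34:23,820 INFO org.apache.hadoop.mapred.MapTask: bufstart = 119534169; bufend = 59763857; bufvoid = 298844160
2010-11-04 10:34:23,820 INFO org.apache.hadoop.mapred.MapTask: kvstart = 438913; kvend = 585320; length = 983040
2010-11-04 10:34:41,615 INFO org.apache.hadoop.mapred.MapTask: Finished spill 3
2010-11-04 10:35:45,352 INFO org.apache.hadoop.mapred.MapTask: Spilling map output: buffer full= true
2010-11-04 10:35:45,547 INFO org.apache.hadoop.mapred.MapTask: bufstart = 59763857; bufend = 298837899; bufvoid = 298844160
2010-11-04 10:35:45,547 INFO org.apache.hadoop.mapred.MapTask: kvstart = 585320; kvend = 731585; length = 983040
2010-11-04 10:45:41,289 INFO org.apache.hadoop.mapred.MapTask: Finished spill 4
"""

STAMP = "%Y-%m-%d %H:%M:%S,%f"  # log4j-style timestamp with milliseconds


def spill_report(log):
    """Return one dict per spill: id, duration in seconds, and logged counters."""
    spills = []
    current = None
    for line in log.splitlines():
        if not line.strip():
            continue
        ts = datetime.strptime(line[:23], STAMP)
        if "Spilling map output" in line:
            current = {"start": ts}
        elif current is not None:
            # Collect "name = number" pairs such as bufstart, bufend, kvstart.
            for key, value in re.findall(r"(\w+) = (\d+)", line):
                current[key] = int(value)
            done = re.search(r"Finished spill (\d+)", line)
            if done:
                current["id"] = int(done.group(1))
                current["seconds"] = (ts - current["start"]).total_seconds()
                spills.append(current)
                current = None
    return spills


def spilled_bytes(spill):
    """Bytes between bufstart and bufend (assumption: circular buffer, wraps at bufvoid)."""
    if spill["bufend"] >= spill["bufstart"]:
        return spill["bufend"] - spill["bufstart"]
    return (spill["bufvoid"] - spill["bufstart"]) + spill["bufend"]


if __name__ == "__main__":
    for s in spill_report(LOG):
        print("spill", s["id"], "took", s["seconds"], "s for", spilled_bytes(s), "bytes")
```

If the circular-buffer reading is right, both spills cover nearly the same number of bytes (roughly 239 MB each), so the size of the spill alone would not explain the gap.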

Can someone provide a hint as to what the reason might be?

Thanks