Hadoop, mail # user - Slow shuffle stage?


Re: Slow shuffle stage?
Joey Echeverria 2011-11-11, 16:43
You can enable speculative execution on a per-job basis, assuming your
administrator hasn't marked it final. The parameter you want is
mapred.map.tasks.speculative.execution.

-Joey
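
For reference, one way to pass that setting per job is as a generic -D option on the command line. This is only a sketch: it assumes the job's driver parses generic options (for example via ToolRunner), and the jar name, class name, and paths are hypothetical. The helper just assembles the argument list; mapred.reduce.tasks.speculative.execution is the matching reduce-side property.

```python
def speculative_execution_args(enable_maps=True, enable_reduces=None):
    """Build generic -D options that toggle speculative execution for one job.

    Assumes the driver honors generic options (e.g. it runs through ToolRunner).
    If the administrator marked a property final, the override is ignored.
    """
    args = ["-D", "mapred.map.tasks.speculative.execution="
            + str(bool(enable_maps)).lower()]
    if enable_reduces is not None:
        args += ["-D", "mapred.reduce.tasks.speculative.execution="
                 + str(bool(enable_reduces)).lower()]
    return args


if __name__ == "__main__":
    # Hypothetical jar, driver class, and paths, for illustration only.
    command = (["hadoop", "jar", "myjob.jar", "MyJob"]
               + speculative_execution_args()
               + ["input_dir", "output_dir"])
    print(" ".join(command))
```

Generic options need to come before the job's own arguments, which is why the helper output is placed right after the class name.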

On Fri, Nov 11, 2011 at 10:57 AM, Keith Wiley <[EMAIL PROTECTED]> wrote:
> Oh, so the reported shuffle time incorporates the mapper time.  I suppose that makes sense since the two stages overlap.  That explains a lot.
>
> As to the map times, it's mostly due to task attempt failures.  Any attempt that fails costs a lot of time.  It is almost never a data-driven error, so subsequent attempts succeed, but it costs time when the earlier attempts fail.  The cause is usually some cluster hairiness that I can neither control (I'm not the administrator) nor fully comprehend: failure to retrieve blocks, random err 141 and 139 failures, that sort of thing.
>
> I'll check on the speculative execution.  I can't remember...I think so though.
>
> On Nov 11, 2011, at 5:53 AM, Joey Echeverria wrote:
>
>> Another thing to look at is the map outlier. The shuffle will start by default when 5% of the maps are done, but won't finish until after the last map is done. Since one of your maps took 37 minutes, your shuffle will take at least that long.
>>
>> I would check the following:
>>
>> Is the input skewed?
>> Does the same node always result in the 37 minute map task?
>> Is speculative execution turned on?
>>
>> -Joey
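
A rough illustration of the timing argument above, not a model of Hadoop's scheduler: given the minute at which each map finishes, the shuffle cannot begin before the slow-start fraction of maps is done and cannot end before the slowest map finishes. The 5% default is the one mentioned above (the property is mapred.reduce.slowstart.completed.maps); rounding the threshold up to a whole map is an assumption, and the finish times are made up apart from the 37-minute outlier.

```python
import math


def shuffle_window(map_finish_minutes, slowstart=0.05):
    """Return (earliest shuffle start, earliest shuffle end) in minutes.

    map_finish_minutes: finish time of each map, measured from job start.
    slowstart: fraction of maps that must finish before reducers may start.
    """
    finishes = sorted(map_finish_minutes)
    # Assumption: the threshold is rounded up to a whole number of maps.
    needed = max(1, math.ceil(slowstart * len(finishes)))
    earliest_start = finishes[needed - 1]
    # The last map's output must be copied, so the shuffle outlasts it.
    earliest_end = finishes[-1]
    return earliest_start, earliest_end


if __name__ == "__main__":
    # Made-up data: 265 maps done at minute 10, one straggler at minute 37.
    start, end = shuffle_window([10] * 265 + [37])
    print("shuffle can start at minute", start, "and ends no earlier than minute", end)
```

With one straggler, the reported shuffle time is mostly time spent waiting for that map, not time spent copying data.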
>>
>>
>>
>> On Nov 10, 2011, at 22:07, Prashant Sharma <[EMAIL PROTECTED]> wrote:
>>
>>> Can you tell us about your cluster? Is it a single node? How big is your
>>> data, and what is the bandwidth between nodes? (The copy might take time
>>> in that case.)
>>> -P
>>>
>>> On Fri, Nov 11, 2011 at 6:50 AM, Keith Wiley <[EMAIL PROTECTED]> wrote:
>>>
>>>> What sorts of causes might be responsible for a long or slow shuffle
>>>> stage?  For example, I have a job of 266 maps (each emitting 4 records) and
>>>> 17 reduces (each ingesting about 60 records) that takes 72 minutes to
>>>> complete.  The maps tend to run in about 9-13 minutes (the value in
>>>> parentheses under the Finish Time column of the map task list in the job
>>>> tracker) and the reduces run in about 37 minutes (same column).  If I click
>>>> into a specific reduce task, I see a Finish Time of 37 minutes of course,
>>>> and a Shuffle time of about 27 minutes.
>>>>
>>>> So, 11 minutes were spent in the maps, 10 in the reduces, and 27
>>>> shuffling.  Note that the 72 minute overall job time is considerably longer
>>>> than the sum of these three averages because of a few outlier maps (25
>>>> minutes, one even took 37 minutes) that held up the later stages.
>>>>
>>>> Disregarding the outliers, it's still spending more than 50% of the job
>>>> time (27 out of 48 minutes) shuffling instead of doing actual computation
>>>> in the maps and reducers.  This feels inefficient to me.
>>>>
>>>> What causes this and what can be done to improve it?
>>>>
>>>> Thanks.
>
>
> ________________________________________________________________________________
> Keith Wiley     [EMAIL PROTECTED]     keithwiley.com    music.keithwiley.com
>
> "It's a fine line between meticulous and obsessive-compulsive and a slippery
> rope between obsessive-compulsive and debilitatingly slow."
>                                           --  Keith Wiley
> ________________________________________________________________________________
>
>

--
Joseph Echeverria
Cloudera, Inc.
443.305.9434