Hadoop >> mail # user >> Slow shuffle stage?


Re: Slow shuffle stage?
892 nodes, 4 tasks each, 3:1 mapper/reducer ratio.  Each map task outputs four records, ~18MB each.  They are fairly evenly distributed to the 17 reducers.  As to the bandwidth of the cluster, I don't really know.  I'll look into that.
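
As a rough sanity check on the figures quoted in this thread (266 maps, 4 records per map, ~18MB per record, 17 reducers, ~27 minutes of shuffle per reduce), here is a back-of-the-envelope sketch of how much data the shuffle moves and what per-reducer copy rate that implies. It assumes the records are spread evenly across reducers and that the whole reported shuffle time is spent copying, which is probably pessimistic since shuffle time can include waiting on slow maps. The function and variable names are made up for illustration.

```python
# Figures taken from this thread; all are approximate.
NUM_MAPS = 266
RECORDS_PER_MAP = 4
MB_PER_RECORD = 18.0       # "~18MB each"
NUM_REDUCERS = 17
SHUFFLE_MINUTES = 27.0     # shuffle time reported for a typical reduce task


def shuffle_estimate(num_maps, records_per_map, mb_per_record,
                     num_reducers, shuffle_minutes):
    """Estimate shuffle volume and the implied per-reducer copy rate.

    Assumes an even distribution of records across reducers and that the
    entire shuffle time is spent copying data (both are simplifications).
    """
    total_records = num_maps * records_per_map
    total_mb = total_records * mb_per_record
    mb_per_reducer = total_mb / num_reducers
    shuffle_seconds = shuffle_minutes * 60
    return {
        "total_records": total_records,
        "records_per_reducer": total_records / num_reducers,
        "total_mb": total_mb,
        "mb_per_reducer": mb_per_reducer,
        "mb_per_sec_per_reducer": mb_per_reducer / shuffle_seconds,
    }


est = shuffle_estimate(NUM_MAPS, RECORDS_PER_MAP, MB_PER_RECORD,
                       NUM_REDUCERS, SHUFFLE_MINUTES)
for name, value in est.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions each reducer pulls a little over 1GB at well under 1MB/s, so if the network between nodes is anything like ordinary datacenter Ethernet, raw bandwidth alone seems unlikely to account for the 27 minutes.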

On Nov 10, 2011, at 7:07 PM, Prashant Sharma wrote:

> Can you tell us about your cluster? Is it a single node? How big is your
> data, and what is the bandwidth between nodes? (The copy might take time
> in that case.)
> -P
>
> On Fri, Nov 11, 2011 at 6:50 AM, Keith Wiley <[EMAIL PROTECTED]> wrote:
>
>> What sorts of causes might be responsible for a long or slow shuffle
>> stage?  For example, I have a job of 266 maps (each emitting 4 records) and
>> 17 reduces (each ingesting about 60 records) that takes 72 minutes to
>> complete.  The maps tend to run in about 9-13 minutes (the value in
>> parentheses under the Finish Time column of the map task list in the job
>> tracker) and the reduces run in about 37 minutes (same column).  If I click
>> into a specific reduce task, I see a Finish Time of 37 minutes of course,
>> and a Shuffle time of about 27 minutes.
>>
>> So, 11 minutes were spent in the maps, 10 in the reduces, and 27
>> shuffling.  Note that the 72 minute overall job time is considerably longer
>> than the sum of these three averages because of a few outlier maps (25
>> minutes, one even took 37 minutes) that held up the later stages.
>>
>> Disregarding the outliers, it's still spending more than 50% of the job
>> time (27 out of 48 minutes) shuffling instead of doing actual computation
>> in the maps and reducers.  This feels inefficient to me.
>>
>> What causes this and what can be done to improve it?
>>
>> Thanks.
________________________________________________________________________________
Keith Wiley     [EMAIL PROTECTED]     keithwiley.com    music.keithwiley.com

"What I primarily learned in grad school is how much I *don't* know.
Consequently, I left grad school with a higher ignorance to knowledge ratio than
when I entered."
                                           --  Keith Wiley
________________________________________________________________________________