MapReduce >> mail # user >> RE: issue about Shuffled Maps in MR job summary


Re: issue about Shuffled Maps in MR job summary
I read the docs and found that if I have 8 reducers, each map task outputs 8
partitions, and each partition is sent to a different reducer. So if I
increase the number of reducers, the number of partitions increases, but the
volume of network traffic stays the same. Why, then, does increasing the
number of reducers sometimes not decrease the job completion time?
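
To make the observation above concrete, here is a toy model in Python. It is only a sketch under stated assumptions: `partition` and `shuffle_stats` are hypothetical helpers, the CRC32-based partitioner is a stand-in for a hash partitioner (terasort itself uses a total-order partitioner), and the record data is made up. It shows that the total bytes shuffled do not depend on the reducer count, while the number of map-output segments to fetch grows with it.

```python
import zlib

def partition(key: str, num_reducers: int) -> int:
    # Stand-in for a hash partitioner: the same key always lands in the same partition.
    return zlib.crc32(key.encode("utf-8")) % num_reducers

def shuffle_stats(map_outputs, num_reducers):
    """Return (total_bytes, segments) for a list of per-map record lists.

    Each record is a (key, value) pair of strings. A "segment" is one
    map-task partition destined for one reducer.
    """
    total_bytes = 0
    segments = 0
    for records in map_outputs:
        sizes = [0] * num_reducers
        for key, value in records:
            sizes[partition(key, num_reducers)] += len(key) + len(value)
        total_bytes += sum(sizes)
        # Assumption: every map task produces one partition per reducer,
        # and each one counts as a fetch even if it is small or empty.
        segments += num_reducers
    return total_bytes, segments

# Made-up data: 20 map tasks, 100 records each.
map_outputs = [[(f"key-{m}-{i}", "x" * 10) for i in range(100)] for m in range(20)]

bytes_8, segs_8 = shuffle_stats(map_outputs, 8)
bytes_16, segs_16 = shuffle_stats(map_outputs, 16)

print(bytes_8 == bytes_16)  # same data volume either way
print(segs_8, segs_16)      # segment count scales with the reducer count
```

In this model the same bytes are simply cut into more, smaller pieces, so any fixed per-fetch or per-task cost is paid more often. That may be one reason extra reducers do not always shorten a job, though the model says nothing about real cluster behaviour.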

On Wed, Dec 11, 2013 at 1:48 PM, Vinayakumar B <[EMAIL PROTECTED]> wrote:

>  It looks simple :)
>
>
>
> Shuffled Maps= Number of Map Tasks * Number of Reducers
>
>
>
> Thanks and Regards,
>
> Vinayakumar B
>
>
>
> *From:* ch huang [mailto:[EMAIL PROTECTED]]
> *Sent:* 11 December 2013 10:56
> *To:* [EMAIL PROTECTED]
> *Subject:* issue about Shuffled Maps in MR job summary
>
>
>
> Hi all,
>
> I ran terasort with 16 reducers and with 8 reducers. When I doubled the
> number of reducers, the Shuffled Maps counter also doubled. My question:
> the job only runs 20 map tasks (the input is 10 files of 100 MB each, and
> my block size is 64 MB, so there are 20 splits). Why do I need to shuffle
> 160 maps in the 8-reducer run and 320 maps in the 16-reducer run? How is
> the Shuffled Maps number calculated?
>
>
>
> 16-reducer summary output:
>
> Shuffled Maps = 320
>
> 8-reducer summary output:
>
> Shuffled Maps = 160
>
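
The arithmetic in this thread can be checked with a short Python sketch. The function names `num_splits` and `shuffled_maps` are hypothetical, and the split count is simplified to one split per started block; real input formats may apply extra rules when sizing the last split.

```python
import math

def num_splits(file_size_mb: float, block_size_mb: float) -> int:
    # Simplified assumption: one split per (possibly partial) block of the file.
    return math.ceil(file_size_mb / block_size_mb)

def shuffled_maps(num_maps: int, num_reducers: int) -> int:
    # Formula quoted in the thread: every reducer fetches output from every map task.
    return num_maps * num_reducers

# Figures from the original question: 10 files of 100 MB, 64 MB block size.
maps = 10 * num_splits(100, 64)

print(maps)                     # 20
print(shuffled_maps(maps, 8))   # 160
print(shuffled_maps(maps, 16))  # 320
```

These values match the 20 map tasks and the Shuffled Maps counters of 160 and 320 reported in the two runs.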