

Re: issue about Shuffled Maps in MR job summary
I read the docs and found that with 8 reducers, each map task writes 8
partitions, and each partition is sent to a different reducer. So if I
increase the number of reducers, the number of partitions increases, but
the total volume of network traffic stays the same. Why, then, does
increasing the number of reducers sometimes not decrease the job
completion time?
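
To make the question concrete, here is a minimal sketch of the partition arithmetic described above. It assumes every map task spreads its output evenly across all reducers, which real jobs only approximate. The function name `shuffle_profile` and the 50 MB per-map output size are hypothetical and chosen only for illustration; the 20 map tasks come from the quoted message.

```python
def shuffle_profile(num_maps, num_reducers, map_output_bytes):
    """Return (fetches, bytes_per_fetch, total_bytes) for one job.

    Assumption: each map task's output is split evenly into one
    partition per reducer, and each partition is fetched once.
    """
    fetches = num_maps * num_reducers          # one fetch per (map, reducer) pair
    total_bytes = num_maps * map_output_bytes  # independent of the reducer count
    bytes_per_fetch = total_bytes / fetches
    return fetches, bytes_per_fetch, total_bytes


# Hypothetical per-map output size of 50 MB, for illustration only.
for reducers in (8, 16):
    fetches, per_fetch, total = shuffle_profile(20, reducers, 50 * 1024 * 1024)
    print(reducers, fetches, per_fetch, total)
```

Under these assumptions, doubling the reducers doubles the number of fetches and halves the size of each one, while the total bytes moved do not change. Any cost that is paid per fetch or per reduce task therefore does not shrink, which is one plausible reason the job time may not improve.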

On Wed, Dec 11, 2013 at 1:48 PM, Vinayakumar B <[EMAIL PROTECTED]> wrote:

>  It looks simple :)
>
>
>
> Shuffled Maps = Number of Map Tasks * Number of Reducers
>
>
>
> Thanks and Regards,
>
> Vinayakumar B
>
>
>
> *From:* ch huang [mailto:[EMAIL PROTECTED]]
> *Sent:* 11 December 2013 10:56
> *To:* [EMAIL PROTECTED]
> *Subject:* issue about Shuffled Maps in MR job summary
>
>
>
> Hi, mailing list:
>
> I ran terasort with 8 reducers and with 16 reducers. When I doubled the
> number of reducers, the Shuffled Maps counter also doubled. The job only
> runs 20 map tasks (the input is 10 files of 100 MB each and my block size
> is 64 MB, so there are 20 splits). Why are 160 maps shuffled in the
> 8-reducer run and 320 in the 16-reducer run? How is the Shuffled Maps
> number calculated?
>
>
>
> 16 reducer summary output:
>
> Shuffled Maps = 320
>
> 8 reducer summary output:
>
> Shuffled Maps = 160
>
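
The numbers in this thread can be reproduced with a short sketch. It uses the formula from the reply (Shuffled Maps = Number of Map Tasks * Number of Reducers) and approximates the split count as the file size divided by the block size, rounded up. That rounding is a simplification of the real input-split logic, and the function names `num_splits` and `shuffled_maps` are hypothetical.

```python
import math


def num_splits(file_size_mb, block_size_mb):
    """Approximate splits per file as ceil(file size / block size).

    Assumption: this ignores any extra rules the real input format
    applies when sizing the last split.
    """
    return math.ceil(file_size_mb / block_size_mb)


def shuffled_maps(num_map_tasks, num_reducers):
    """Each reducer fetches one partition from every map task."""
    return num_map_tasks * num_reducers


# 10 input files of 100 MB each with a 64 MB block size, as in the question.
map_tasks = 10 * num_splits(100, 64)
print(map_tasks)                     # 20
print(shuffled_maps(map_tasks, 8))   # 160
print(shuffled_maps(map_tasks, 16))  # 320
```

The counter counts map outputs fetched across all reducers rather than distinct map tasks, which is why it exceeds the 20 maps the job actually ran.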