MapReduce, mail # user - Cluster config: Mapper:Reducer Task Capacity


Himanshu Vijay 2013-09-30, 19:39
Re: Cluster config: Mapper:Reducer Task Capacity
Sandy Ryza 2013-09-30, 19:52
Hi Himanshu,

Changing the ratio is definitely a reasonable thing to do. The capacities
come from the mapred.tasktracker.map.tasks.maximum and
mapred.tasktracker.reduce.tasks.maximum tasktracker configurations. You can
tweak these on your nodes to get your desired ratio.
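
As a rough illustration of the tweak described above, here is a minimal Python sketch that splits one node's task slots to approximate a target map:reduce ratio and renders the two properties as a mapred-site.xml fragment. The helper names (split_slots, render_properties) and the 15-slot node are hypothetical and are not taken from the thread; only the two property names come from the reply. The cluster-wide capacities are the sums of these per-node values, and tasktrackers typically need a restart to pick up the change.

```python
def split_slots(total_slots, map_to_reduce_ratio):
    """Split one node's task slots into (map, reduce) counts near the target ratio.

    Both arguments are illustrative inputs, not values from the thread.
    """
    if total_slots < 2:
        raise ValueError("need at least one map slot and one reduce slot")
    # A ratio of r maps per reduce means reduces get about 1 / (r + 1) of the slots.
    reduce_slots = max(1, round(total_slots / (map_to_reduce_ratio + 1)))
    map_slots = total_slots - reduce_slots
    return map_slots, reduce_slots


def render_properties(map_slots, reduce_slots):
    """Render the two tasktracker properties as a mapred-site.xml fragment."""
    template = (
        "<property>\n"
        "  <name>{name}</name>\n"
        "  <value>{value}</value>\n"
        "</property>"
    )
    return "\n".join([
        template.format(name="mapred.tasktracker.map.tasks.maximum", value=map_slots),
        template.format(name="mapred.tasktracker.reduce.tasks.maximum", value=reduce_slots),
    ])


if __name__ == "__main__":
    # Hypothetical node with 15 slots in total, aiming for roughly 4 maps per reduce.
    maps, reduces = split_slots(total_slots=15, map_to_reduce_ratio=4)
    print(render_properties(maps, reduces))
```

The printed fragment would go inside the <configuration> element of mapred-site.xml on each tasktracker node. How many slots a node can sustain in total depends on its cores and memory, which this sketch does not model.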

-Sandy
On Mon, Sep 30, 2013 at 12:39 PM, Himanshu Vijay <[EMAIL PROTECTED]> wrote:

> Hi,
>
> Our Hadoop cluster is running 0.20.203. The cluster currently has a 'Map
> Task Capacity' of 8900+ and a 'Reduce Task Capacity' of 3300+, resulting in
> a ratio of 2.7. We run a wide variety of jobs and want to increase the
> throughput.
>
> My manual observation was that we hit the Mapper capacity, and hence many
> jobs have to wait even though there is a lot of room left in Reduce
> capacity. I mined the jobtracker logs for the jobs that completed and saw
> that on an hourly as well as a daily basis the mapper:reducer ratio was 4-5.
>
> To increase the throughput, I was thinking of experimenting with changing
> the Map and Reduce Task Capacity so that the ratio increases from 2.7 to
> ~4.
>
> Does this sound like a correct approach? Is this something that I can
> control, or is it determined automatically by Hadoop?
>
> Have any of you done this kind of exercise? If so, could you please point
> me to how to go about changing this ratio? I am not finding much literature
> on it.
>
> Note: Map and Reduce Task Capacity is the maximum total number of
> mappers/reducers you can run on the cluster at any point.
>
> Regards,
> -Himanshu Vijay
>
Himanshu Vijay 2013-10-01, 07:06