This topic was discussed two years ago.
On Fri, Jun 11, 2010 at 8:45 AM, Edward Capriolo <[EMAIL PROTECTED]> wrote:
> On Fri, Jun 11, 2010 at 8:35 AM, Sébastien Rainville <
> [EMAIL PROTECTED]> wrote:
> > Hi,
> > I'm playing around with the Hadoop config to optimize the resources of
> > our cluster. I'm noticing that the CPU usage is sub-optimal. All the
> > machines in the cluster have one quad-core CPU. I looked at our
> > mapred.tasktracker.map.tasks.maximum and
> > mapred.tasktracker.reduce.tasks.maximum settings: the max map tasks
> > is set to 2 and the max reduce tasks is set to 1, keeping 1 core for
> > the database (Cassandra) and the OS.
> > My question is: why separate the settings for the map tasks and reduce
> > tasks? I feel like what I want is to set
> > mapred.tasktracker.tasks.maximum=3, so that all three cores are always
> > available for both map and reduce tasks.
> > Am I missing something?
> > Thanks,
> > Sebastien
> That suggestion makes sense. As you run more concurrent jobs you may find
> that having dedicated slots for reduce tasks is useful. You would not want
> a cluster running 600 mappers and 0 reducers :)
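
For reference, the setup described in the original question could be written
roughly as below. This is only a sketch: the two property names and the
values 2 and 1 come from the message above, but placing them in
mapred-site.xml is an assumption, since the thread does not say which file
was used.

```xml
<!-- Sketch only: assumes these properties live in mapred-site.xml on each TaskTracker node -->
<configuration>
  <!-- At most 2 map tasks run concurrently on this node -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>2</value>
  </property>
  <!-- At most 1 reduce task runs concurrently on this node -->
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>1</value>
  </property>
</configuration>
```

With these values a node runs at most 3 tasks at once, but a job phase with
only map work can use no more than 2 of those slots, which is the idle
capacity the question is about.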