

alxsss@... 2012-12-18, 06:08
Chris Embree 2012-12-18, 06:12
Re: number of mapred slots
I have two slave nodes and one master. One slave is quad-core (4 CPUs, 16 GB memory), the other slave is dual-core (2 CPUs, 16 GB memory), and the master is dual-core with 4 GB memory. I run Hadoop and HBase, so both slaves already run 4 processes (datanode, tasktracker, HBase regionserver, and ZooKeeper). I have this config in mapred-site.xml:

<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>10</value>
   <description>the number of available cores on the tasktracker machines
for map tasks
  </description>
</property>

<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>7</value>
   <description>the number of available cores on the tasktracker machines
for reduce tasks
  </description>
</property>
<property>
  <name>mapred.map.tasks</name>
  <value>10</value>
  <description>
    define mapred.map tasks to be number of slave hosts
  </description>
</property>

<property>
  <name>mapred.reduce.tasks</name>
  <value>7</value>
  <description>
    define mapred.reduce tasks to be number of slave hosts
  </description>
</property>
  
 

To my understanding, this means that the number of reduce tasks must be 7. However, Hadoop scheduled 10 reducers and all of them started at once; there were no pending reducers. Can anyone explain why 10 reducers were running and where those slots come from, given that there are 6 CPUs in total and 8 processes already running on the slave nodes?

Thanks.
Alex.
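
For illustration, the slot arithmetic behind the question can be sketched as follows. In Hadoop 1.x, mapred.tasktracker.reduce.tasks.maximum and mapred.tasktracker.map.tasks.maximum are per-TaskTracker limits that the administrator sets; Hadoop does not derive them from the number of cores. The sketch assumes that both slaves run a TaskTracker with the config above, and that the job itself asked for 10 reducers (the thread does not say where that number came from). The function names are hypothetical.

```python
# Minimal sketch of cluster slot capacity, assuming every TaskTracker
# uses the same per-node maximum from mapred-site.xml.

def cluster_capacity(per_tracker_max, num_tasktrackers):
    """Total slots = per-TaskTracker maximum times number of TaskTrackers."""
    return per_tracker_max * num_tasktrackers

def running_and_pending(requested_tasks, capacity):
    """Split requested tasks into those that fit in slots now and those that wait."""
    running = min(requested_tasks, capacity)
    return running, requested_tasks - running

# Values from the config in this message; two slaves is from the cluster description.
reduce_capacity = cluster_capacity(per_tracker_max=7, num_tasktrackers=2)
map_capacity = cluster_capacity(per_tracker_max=10, num_tasktrackers=2)

# Assumption: the job requested 10 reducers.
running, pending = running_and_pending(requested_tasks=10, capacity=reduce_capacity)
print(f"reduce slots: {reduce_capacity}, map slots: {map_capacity}")
print(f"reducers running: {running}, pending: {pending}")
```

Under these assumptions the cluster has 14 reduce slots, so 10 reducers can all start at once with none pending, regardless of how many cores the nodes have.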
  

 

-----Original Message-----
From: Chris Embree <[EMAIL PROTECTED]>
To: user <[EMAIL PROTECTED]>
Sent: Mon, Dec 17, 2012 10:12 pm
Subject: Re: number of mapred slots
I think the rule of thumb (from Hortonworks, at least) is 2x cores for map slots and 1x cores for reducers. I don't have my notes here, so I'm not 100% sure. It's just a guideline in any event. :)
TEST, TEST, TEST.  :)
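
As a rough illustration of that guideline, here is a small sketch that applies 2x cores for map slots and 1x cores for reduce slots to the two slave nodes described in this thread. The function name and the node labels are hypothetical, and the factors are only the rule of thumb quoted above, not a tuned recommendation.

```python
# Rule-of-thumb sketch: map slots = 2 x cores, reduce slots = 1 x cores.
# The factors are the guideline from this thread, not measured values.

def suggested_slots(cores, map_factor=2, reduce_factor=1):
    """Return suggested per-node slot counts for a given core count."""
    return {"map": cores * map_factor, "reduce": cores * reduce_factor}

# Core counts taken from the cluster description in the thread.
slaves = {"quad-core slave": 4, "dual-core slave": 2}

for name, cores in slaves.items():
    slots = suggested_slots(cores)
    print(f"{name}: map.tasks.maximum={slots['map']}, "
          f"reduce.tasks.maximum={slots['reduce']}")
```

The results would go into mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum on each node. The guideline does not account for the other daemons (datanode, regionserver, ZooKeeper) or for memory, so the numbers are a starting point for testing.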
On Tue, Dec 18, 2012 at 1:08 AM,  <[EMAIL PROTECTED]> wrote:

Hello,

I was unable to find any information online about the relationship between mapred slots and the number of CPUs. All I found was that it is advisable to schedule two processes per CPU. If this is true, then a dual-core slave node (two CPUs) that runs datanode, tasktracker, HBase regionserver, and ZooKeeper theoretically has no room to run an additional mapred task. Any comments on this are welcome.

In general, what is a mapred slot, and how is it related to the number of CPU cores?

Thanks in advance.
Alex.

 