MapReduce >> mail # user >> mapred.tasktracker.map.tasks.maximum


RE: mapred.tasktracker.map.tasks.maximum
Mark,

The way I understand it...

The number of mappers is calculated by dividing your input size by the file split size (64 MB by default).

So, say your input is 64 GB in size; you will end up with 1024 mappers.
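The arithmetic above can be sketched as follows (this assumes splits align with the default 64 MB block size; `estimate_mappers` is an illustrative helper, not a Hadoop API):

```python
# Rough sketch: number of map tasks is input size divided by split size,
# rounded up, since a partial final split still gets its own mapper.

def estimate_mappers(input_bytes, split_bytes=64 * 1024 * 1024):
    """Ceiling division: -(-a // b) rounds up without importing math."""
    return -(-input_bytes // split_bytes)

# 64 GB of input with 64 MB splits:
print(estimate_mappers(64 * 1024**3))  # -> 1024
```

In practice the real split count also depends on the InputFormat and on file boundaries (splits do not span files), so treat this as an estimate.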

Slots are a different story. The slot count is the number of map tasks processed by a single node in parallel. It is normally derived from the number of cores on that node (one slot per core).
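A minimal sketch of how that rule of thumb might look in mapred-site.xml, assuming the one-slot-per-core guideline on a 32-core node (the exact value is a tuning choice, not a fixed rule):

```xml
<!-- Hypothetical example: one map slot per core on a 32-core node.
     In practice this is often reduced to leave cores free for
     reduce tasks and the TaskTracker/DataNode daemons. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>32</value>
</property>
```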

Rgds,

From: Mark Kerzner [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, November 13, 2012 10:20 AM
To: [EMAIL PROTECTED]
Subject: Re: mapred.tasktracker.map.tasks.maximum

1.0.1
On Tue, Nov 13, 2012 at 10:18 AM, Serge Blazhiyevskyy <[EMAIL PROTECTED]> wrote:
What hadoop version are we talking about?

From: Mark Kerzner <[EMAIL PROTECTED]>
Reply-To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Date: Tuesday, November 13, 2012 5:16 PM
To: Hadoop User <[EMAIL PROTECTED]>
Subject: mapred.tasktracker.map.tasks.maximum

Hi,

I have a cluster with 4 nodes and 32 cores on each. My value for the maximum number of map tasks per node is set to 1:

  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <!-- see other kb entry about this one. -->
    <value>1</value>
    <final>true</final>
  </property>
(which I think is wrong).

However, when sizable jobs run, I see 65 mappers working, so it seems that the cluster does create more than 1 mapper per node.

Questions: what maximum number of mappers would be appropriate in this situation, and is this the right way to set it?

Thank you. Sincerely,
Mark

NOTICE: This e-mail message and any attachments are confidential, subject to copyright and may be privileged. Any unauthorized use, copying or disclosure is prohibited. If you are not the intended recipient, please delete and contact the sender immediately. Please consider the environment before printing this e-mail.