MapReduce >> mail # user >> Re: How to find the num of Mappers


Re: How to find the num of Mappers
Your question is answered here:
http://wiki.apache.org/hadoop/HowManyMapsAndReduces

To answer the first part of your question:

It is not mandatory to run all the maps of a given job at the same time.
Maps are executed as map slots become available on the tasktrackers.
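
As a rough sketch of the arithmetic behind the second part of the question:
the number of map tasks for a single splittable file is approximately the
file size divided by the split size, rounded up. The 64 MB split size below
is an assumption taken from the question's own numbers (640 MB giving 10
blocks), and one map slot per node is likewise assumed from the M1, M2, M3
example. The function names are made up for illustration and are not Hadoop
APIs. The real count also depends on the InputFormat, the configured
minimum and maximum split sizes, and whether the file is splittable.

```python
MB = 1024 ** 2
TB = 1024 ** 4


def num_map_tasks(file_size_bytes, split_size_bytes=64 * MB):
    """Estimate map tasks for one splittable file: one task per input split.

    The 64 MB default is an assumption inferred from the question.
    """
    # Integer ceiling division, so huge sizes do not lose precision.
    return -(-file_size_bytes // split_size_bytes)


def num_waves(map_tasks, total_map_slots):
    """Estimate how many rounds are needed when slots are fewer than tasks."""
    return -(-map_tasks // total_map_slots)


# Assumed cluster: 3 nodes with 1 map slot each (hypothetical values).
total_slots = 3 * 1

small_job = num_map_tasks(640 * MB)   # 10 map tasks
large_job = num_map_tasks(10 * TB)    # 163840 map tasks

print(small_job, num_waves(small_job, total_slots))
print(large_job, num_waves(large_job, total_slots))
```

So the number of data nodes does not change how many map tasks are created.
It only affects how many of them can run at once.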
On Fri, Apr 12, 2013 at 1:51 PM, Sai Sai <[EMAIL PROTECTED]> wrote:

> Suppose we have a 640 MB data file and 3 Data Nodes in a cluster.
> The file can be split into 10 blocks, and Mappers M1, M2, M3 start
> first.
> As each one completes its task, M4 and so on will be run.
> It appears that it is not necessary to run all 10 Map tasks in
> parallel at once.
> Just wondering if this is the right assumption.
> What if we have a 10 TB data file with 3 Data Nodes? How do we find the
> number of mappers that will be created?
> Thanks
> Sai
>

--
Nitin Pawar