HDFS >> mail # user >> Re: Questions with regard to scheduling of map and reduce tasks


Vasco Visser 2012-08-30, 19:03
Re: Questions with regard to scheduling of map and reduce tasks
The first scenario is expected behavior. And yes, you should limit the number
of reducers.

Serge

On 8/30/12 10:41 AM, "Vasco Visser" <[EMAIL PROTECTED]> wrote:

>Hi,
>
>When running a job with more reducers than there are containers available
>in the cluster, all reducers get scheduled, leaving no containers available
>for the mappers to be scheduled. The result is starvation, and the job
>never finishes. Is this to be considered a bug, or is it expected
>behavior? The workaround is to limit the number of reducers to fewer
>than the number of containers available.
>
>Also, it seems that tasks are picked at random from the combined pool of
>pending map and reduce tasks and then scheduled. This causes less than
>optimal behavior. For example, I run a job with 500 mappers and 30
>reducers (my cluster has only 16 machines, with two containers per machine
>(dual-core machines)). What I observe is that halfway through the job
>all reduce tasks are scheduled, leaving only one container for 200+
>map tasks. Again, is this expected behavior? If so, what is the idea
>behind it? And are the map and reduce tasks indeed randomly scheduled,
>or does it only look like they are?
>
>Any advice is welcome.
>
>Regards,
>Vasco
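
The container arithmetic in the second scenario can be checked with a small sketch. This is only an illustration of the numbers quoted in the message, not how the Hadoop scheduler is implemented. The function names are hypothetical, and the sketch assumes that one container is held by the job's application master, which would be consistent with the single free container reported above.

```python
def containers_left_for_maps(machines, containers_per_machine, reducers,
                             am_containers=1):
    """Containers still free for map tasks once every reducer holds one.

    am_containers is an assumption: one container reserved for the
    job's application master.
    """
    total = machines * containers_per_machine
    return max(total - am_containers - reducers, 0)


def max_safe_reducers(machines, containers_per_machine,
                      min_map_containers=1, am_containers=1):
    """Largest reducer count that still leaves min_map_containers for maps."""
    total = machines * containers_per_machine
    return max(total - am_containers - min_map_containers, 0)


# Figures from the message: 16 machines, 2 containers each, 30 reducers.
print(containers_left_for_maps(16, 2, 30))

# Hypothetical target: keep at least 8 containers free for map tasks.
print(max_safe_reducers(16, 2, min_map_containers=8))
```

With the figures from the message, the first call gives one free container for the map tasks. Requesting at least as many reducers as there are usable containers gives zero, which is the starvation case from the first scenario.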