Hadoop >> mail # general >> how to restrict the concurrent running map tasks?


hwang 2013-01-18, 10:22
Harsh J 2013-01-18, 11:44
Re: how to restrict the concurrent running map tasks?
The general list is for product announcements and the like.  You really
should direct your question to mapreduce-user@.  I have BCC'd general.

I am not an expert on this, but I looked and it appears that you have to
use a special scheduler in the JobTracker to make this happen.

org.apache.hadoop.mapred.LimitTasksPerJobTaskScheduler
It looks a lot like the FIFO scheduler but with a limit on the number of
tasks.  I am not sure if this is something that will work for you or not.
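
A minimal sketch of how that might be wired up, assuming the scheduler is
selected through mapred.jobtracker.taskScheduler and that it reads the
maxRunningTasksPerJob property you already found. Both would go in
mapred-site.xml on the JobTracker node, not in the job's own Configuration,
which may be why setting it from the client had no effect. I have not tested
this, and I believe the limit counts all running tasks of a job (maps and
reduces together) and applies to every job on the cluster:

```xml
<!-- mapred-site.xml on the JobTracker (assumed location; restart the JobTracker afterwards) -->
<property>
  <!-- Replace the default FIFO scheduler with the per-job limiting one -->
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.LimitTasksPerJobTaskScheduler</value>
</property>
<property>
  <!-- Upper bound on concurrently running tasks for any single job -->
  <name>mapred.jobtracker.taskScheduler.maxRunningTasksPerJob</name>
  <value>10</value>
</property>
```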

--Bobby

On 1/18/13 4:22 AM, "hwang" <[EMAIL PROTECTED]> wrote:

>Hi all:
>
>My Hadoop version is 1.0.2. I want at most 10 map tasks running at the
>same time. I have found 2 parameters related to this question.
>
>a) mapred.job.map.capacity
>
>but in my Hadoop version, this parameter seems to have been abandoned.
>
>b) mapred.jobtracker.taskScheduler.maxRunningTasksPerJob (
>http://grepcode.com/file/repo1.maven.org/maven2/com.ning/metrics.collector
>/1.0.2/mapred-default.xml
>)
>
>I set this variable as shown below:
>
>Configuration conf = new Configuration();
>conf.set("date", date);
>conf.set("mapred.job.queue.name", "hadoop");
>conf.set("mapred.jobtracker.taskScheduler.maxRunningTasksPerJob", "10");
>
>DistributedCache.createSymlink(conf);
>Job job = new Job(conf, "ConstructApkDownload_" + date);
>...
>
>The problem is that it doesn't work. There are still more than 50 map
>tasks running when the job starts.
>
>I'm not sure whether I set this parameter the wrong way or misunderstand
>it.
>
>After looking through the Hadoop documentation, I can't find another
>parameter to limit the number of concurrently running map tasks.
>
>I hope someone can help me. Thanks.