Hadoop, mail # general - how to restrict the concurrent running map tasks?


hwang 2013-01-18, 10:22
Harsh J 2013-01-18, 11:44
Re: how to restrict the concurrent running map tasks?
Robert Evans 2013-01-18, 15:49
The general@ list is for product announcements and the like. You really should
direct your question to mapreduce-user@. I have BCC'd general@.

I am not an expert on this, but I looked and it appears that you have to
use a special scheduler in the JobTracker to make this happen.

org.apache.hadoop.mapred.LimitTasksPerJobTaskScheduler
It looks a lot like the FIFO scheduler but with a limit on the number of
tasks. I am not sure if this is something that will work for you or not.
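
As a rough, untested sketch: assuming the scheduler is selected through the
mapred.jobtracker.taskScheduler property and reads its limit from the
JobTracker's own configuration (which would explain why setting the limit
in the job's Configuration has no effect), the JobTracker's mapred-site.xml
might look something like the fragment below. The value 10 is taken from
your message. The JobTracker would need a restart to pick this up, and the
limit would then apply to every job on the cluster, not just yours.

```xml
<!-- mapred-site.xml on the JobTracker node (assumed location, Hadoop 1.x) -->
<configuration>
  <!-- Replace the default FIFO scheduler with the task-limiting one -->
  <property>
    <name>mapred.jobtracker.taskScheduler</name>
    <value>org.apache.hadoop.mapred.LimitTasksPerJobTaskScheduler</value>
  </property>
  <!-- Limit on concurrently running tasks per job -->
  <property>
    <name>mapred.jobtracker.taskScheduler.maxRunningTasksPerJob</name>
    <value>10</value>
  </property>
</configuration>
```

I have not checked whether the limit counts map and reduce tasks together
or separately, so please verify that against the scheduler source for your
version before relying on it.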

--Bobby

On 1/18/13 4:22 AM, "hwang" <[EMAIL PROTECTED]> wrote:

>Hi all:
>
>My Hadoop version is 1.0.2. I want at most 10 map tasks running at the
>same time. I have found two parameters related to this.
>
>a) mapred.job.map.capacity
>
>but in my Hadoop version, this parameter seems to have been dropped.
>
>b) mapred.jobtracker.taskScheduler.maxRunningTasksPerJob
>(http://grepcode.com/file/repo1.maven.org/maven2/com.ning/metrics.collector/1.0.2/mapred-default.xml)
>
>I set this parameter as follows:
>
>Configuration conf = new Configuration();
>conf.set("date", date);
>conf.set("mapred.job.queue.name", "hadoop");
>conf.set("mapred.jobtracker.taskScheduler.maxRunningTasksPerJob", "10");
>
>DistributedCache.createSymlink(conf);
>Job job = new Job(conf, "ConstructApkDownload_" + date);
>...
>
>The problem is that it doesn't work. There are still more than 50 map
>tasks running when the job starts.
>
>I'm not sure whether I set this parameter the wrong way or misunderstood
>it.
>
>After looking through the Hadoop documentation, I can't find another
>parameter to limit the number of concurrently running map tasks.
>
>I hope someone can help me. Thanks.