MapReduce >> mail # user >> Re: Increase the number of mappers in PM mode


Re: Increase the number of mappers in PM mode
Hi:
  I got these interview questions by doing some googling:

Q29. How can you set an arbitrary number of mappers to be created for a job
in Hadoop?

This is a trick question. You cannot set it.

 >> The above test proves you cannot set an arbitrary number of mappers.
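That said, the mapper count is not chosen directly; it falls out of the input splits, which you can influence (as the split-size discussion further down this thread suggests). Below is a minimal sketch of the usual split-size arithmetic, assuming Hadoop 2's FileInputFormat.computeSplitSize() formula; the class name and the concrete sizes are illustrative, not from this thread:

```java
// Sketch of how the map task count is derived from input splits.
// Mirrors the classic FileInputFormat.computeSplitSize() formula;
// exact behavior varies by Hadoop version and InputFormat (assumption).
public class SplitSizeSketch {

    // splitSize = max(minSize, min(maxSize, blockSize))
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Each file contributes roughly ceil(fileLength / splitSize) map tasks.
    static long estimateMapTasks(long fileLength, long splitSize) {
        return (fileLength + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024; // 128 MB HDFS block
        long minSize   = 0L;                 // mapreduce.input.fileinputformat.split.minsize
        long maxSize   = 32L * 1024 * 1024;  // lowered mapred.max.split.size: 32 MB

        long splitSize = computeSplitSize(blockSize, minSize, maxSize);
        long fileLength = 300L * 1024 * 1024; // a hypothetical 300 MB input file

        System.out.println(splitSize);                               // 33554432
        System.out.println(estimateMapTasks(fileLength, splitSize)); // 10
    }
}
```

So lowering the max split size raises the number of map tasks; a tiny 5-record input like the one tested below will always fit in one split and get one mapper.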

Q30. How can you set an arbitrary number of reducers to be created for a job
in Hadoop?

You can either do it programmatically by calling the setNumReduceTasks method
on the JobConf class, or set it as a configuration setting.
 I tested Q30, and it seems right.

 my logs:

[hadoop@Hadoop01 bin]$./hadoop  jar
 ../share/hadoop/mapreduce/hadoop-mapreduce-examples-2.0.0-cdh4.1.2.jar
wordcount -D mapreduce.job.reduces=2  -D mapreduce.jobtracker.address=10.167.14.221:50030 /user/hadoop/yyp/input /user/hadoop/yyp/output3

==================================
Job Counters

Launched map tasks=1

Launched reduce tasks=2   -----> it actually changed.

Rack-local map tasks=1

Total time spent by all maps in occupied slots (ms)=60356

Total time spent by all reduces in occupied slots (ms)=135224

===========================
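For reference, the settings passed with -D above could also be made permanent in mapred-site.xml. This fragment is only illustrative (the values are copied from the command above, not a recommendation):

```xml
<!-- Illustrative mapred-site.xml fragment; values are examples only. -->
<configuration>
  <property>
    <name>mapreduce.jobtracker.address</name>
    <value>10.167.14.221:50030</value>
    <!-- "local" would run the job in-process as a single map and reduce -->
  </property>
  <property>
    <name>mapreduce.job.reduces</name>
    <value>2</value>
  </property>
</configuration>
```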

regards

2013/3/14 YouPeng Yang <[EMAIL PROTECTED]>

> Hi,
>   The docs only have the property
> mapreduce.input.fileinputformat.split.minsize (default value is 0).
>   Does it matter?
>
>
>
> 2013/3/14 Zheyi RONG <[EMAIL PROTECTED]>
>
>> Have you considered changing mapred.max.split.size?
>> As in:
>> http://stackoverflow.com/questions/9678180/change-file-split-size-in-hadoop
>>
>> Zheyi
>>
>>
>> On Thu, Mar 14, 2013 at 3:27 PM, YouPeng Yang <[EMAIL PROTECTED]> wrote:
>>
>>> Hi
>>>
>>>
>>>   I have done some tests in my Pseudo Mode (CDH4.1.2) with MRv2 (YARN),
>>> and:
>>>   According to the doc:
>>>   *mapreduce.jobtracker.address :*The host and port that the MapReduce
>>> job tracker runs at. If "local", then jobs are run in-process as a single
>>> map and reduce task.
>>>   *mapreduce.job.maps (default value is 2)* :The default number of map
>>> tasks per job. Ignored when mapreduce.jobtracker.address is "local".
>>>
>>>   I changed the mapreduce.jobtracker.address = Hadoop:50031.
>>>
>>>   And then run the wordcount examples:
>>>   hadoop jar  hadoop-mapreduce-examples-2.0.0-cdh4.1.2.jar wordcount
>>> input output
>>>
>>>   the output logs are as follows:
>>>         ....
>>>    Job Counters
>>> Launched map tasks=1
>>>  Launched reduce tasks=1
>>> Data-local map tasks=1
>>>  Total time spent by all maps in occupied slots (ms)=60336
>>> Total time spent by all reduces in occupied slots (ms)=63264
>>>      Map-Reduce Framework
>>> Map input records=5
>>>  Map output records=7
>>> Map output bytes=56
>>> Map output materialized bytes=76
>>>         ....
>>>
>>>  It seems it does not work.
>>>
>>>  I thought maybe it is because my input file is small, just 5 records. Is that right?
>>>
>>> regards
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> 2013/3/14 Sai Sai <[EMAIL PROTECTED]>
>>>
>>>>
>>>>
>>>>  In Pseudo Mode, where is the setting to increase the number of mappers,
>>>> or is this not possible?
>>>> Thanks
>>>> Sai
>>>>
>>>
>>>
>>
>