Re: Number of reducers
Ah, well, my bad. See instead the description for mapred.reduce.tasks
in mapred-default.xml, which states this: "Typically set to 99% of the
cluster's reduce capacity, so that if a node fails the reduces can
still be executed in a single wave."
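
For illustration only (the value 20 below is an arbitrary example, not a
recommendation), overriding that default in mapred-site.xml looks like:

  <property>
    <name>mapred.reduce.tasks</name>
    <value>20</value>
  </property>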

FWIW, I set it manually to the level of parallelism I require (given
my partitioned data, etc.).
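
In the driver itself (a minimal, hypothetical sketch; the class and job names
are illustrative), that boils down to a single call on the Job object:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.mapreduce.Job;

  public class MyDriver {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      Job job = new Job(conf, "my job");
      // Client-supplied reducer count; the framework never picks this for you.
      job.setNumReduceTasks(20);
      // ... set mapper, reducer, input/output paths, then job.waitForCompletion(true) ...
    }
  }

The same thing can be passed at submit time with -D mapred.reduce.tasks=20,
provided the driver goes through ToolRunner/GenericOptionsParser.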

On Tue, Aug 28, 2012 at 8:43 PM, abhiTowson cal
<[EMAIL PROTECTED]> wrote:
> hi harsh,
>
> Thanks for the reply. I get your first and second points, but coming to the
> third point, how is it specific to a job?
> My question was specific to a job.
>
> Regards
> Abhishek
>
>
>
> On Mon, Aug 27, 2012 at 11:29 PM, Harsh J <[EMAIL PROTECTED]> wrote:
>> Hi,
>>
>> On Tue, Aug 28, 2012 at 8:32 AM, Abhishek <[EMAIL PROTECTED]> wrote:
>>> Hi all,
>>>
>>> I just want to know: based on what factor does the MapReduce framework decide the number of reducers to launch for a job?
>>
>> The framework does not auto-determine the number of reducers for a
>> job. That is purely user- or client-program-supplied at present.
>>
>>> By default only one reducer will be launched for a given job, is this right? That is, if we do not explicitly mention the number to launch via the command line or the driver class.
>>
>> Yes, by default the number of reduce tasks is configured to be one.
>>
>>> If I choose to mention the number of reducers explicitly, what should I consider? Choosing an inappropriate number of reducers hampers performance.
>>
>> See http://wiki.apache.org/hadoop/HowManyMapsAndReduces
>>
>> --
>> Harsh J

--
Harsh J