MapReduce >> mail # user >> LocalJobRunner and # of reducers


Dmitriy Lyubimov 2011-05-02, 22:53
jason 2011-05-02, 23:47
Dmitriy Lyubimov 2011-05-03, 00:02
jason 2011-05-03, 00:13
Re: LocalJobRunner and # of reducers
See also https://issues.apache.org/jira/browse/MAPREDUCE-434 which has
a patch for this issue.

Cheers,
Tom

On Mon, May 2, 2011 at 5:13 PM, jason <[EMAIL PROTECTED]> wrote:
> I am attaching the originals so you could figure out the diffs on your own :)
>
> On 5/2/11, Dmitriy Lyubimov <[EMAIL PROTECTED]> wrote:
>> Thanks a bunch!
>>
>> (Is there any chance you could send just a diff?)
>>
>> -d
>>
>> On Mon, May 2, 2011 at 4:47 PM, jason <[EMAIL PROTECTED]> wrote:
>>> Dmitriy,
>>>
>>> I remember I had the same problem with local jobs when I tried to
>>> debug my multi-reducer use cases, so I had to create this small patch
>>> that resolves the issue.
>>> You can put these classes into the org.apache.hadoop.mapred package in
>>> your local project and make sure they precede Hadoop's jars on the
>>> classpath.
>>>
>>> My patch is based on Cloudera 0.20.2+320 release.
>>>
>>> Hope this helps.
>>>
>>>
>>> On 5/2/11, Dmitriy Lyubimov <[EMAIL PROTECTED]> wrote:
>>>> Hi,
>>>>
>>>> I was trying to create a test based on a MapReduce job in local mode
>>>> to check various partitioning issues.
>>>>
>>>> But curiously, whenever I switch MapReduce into local mode, I can't
>>>> seem to configure multiple reduce tasks.
>>>>
>>>> Indeed, upon some investigation I found that the following fragment in
>>>> LocalJobRunner resets the number of reducers to 1:
>>>>
>>>>     int numReduceTasks = this.job.getNumReduceTasks();
>>>>     if ((numReduceTasks > 1) || (numReduceTasks < 0))
>>>>     {
>>>>       numReduceTasks = 1;
>>>>       this.job.setNumReduceTasks(1);
>>>>     }
>>>>     outputCommitter.setupJob(jContext);
>>>>     this.status.setSetupProgress(1.0F);
>>>>
>>>>     Map mapOutputFiles = new HashMap();
>>>>
>>>>
>>>>
>>>> Is this a fundamental limitation of the local MapReduce mode? What if
>>>> I need to write a unit test that checks various partitioning
>>>> functions? Is there a workaround?
>>>>
>>>> Also, I don't remember these problems when writing tests based on
>>>> local MapReduce in previous versions (this is cdh3b4), although I
>>>> cannot be sure I ran into exactly the same situation before.
>>>>
>>>> thanks.
>>>> -Dmitriy
>>>>
>>>
>>
>
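
For the unit-testing question in the original message, one possible approach that avoids LocalJobRunner altogether is to call the partitioning function directly and check where keys land. The sketch below is only an illustration under stated assumptions: SimplePartitioner, HashLikePartitioner and histogram are hypothetical names, not Hadoop classes, and the interface merely mirrors the shape of a partitioner's getPartition call. The arithmetic in HashLikePartitioner is the usual hash-modulo scheme that Hadoop's HashPartitioner uses.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PartitionerSketch {

    /** Hypothetical stand-in for a Hadoop partitioner; NOT the real Hadoop API. */
    public interface SimplePartitioner<K> {
        int getPartition(K key, int numPartitions);
    }

    /** Hash-modulo partitioning; masking the sign bit keeps the result non-negative. */
    public static final class HashLikePartitioner<K> implements SimplePartitioner<K> {
        @Override
        public int getPartition(K key, int numPartitions) {
            return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
        }
    }

    /** Counts how many keys land in each partition and rejects out-of-range results. */
    public static <K> Map<Integer, Integer> histogram(
            SimplePartitioner<K> partitioner, Iterable<K> keys, int numPartitions) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (K key : keys) {
            int part = partitioner.getPartition(key, numPartitions);
            if (part < 0 || part >= numPartitions) {
                throw new IllegalStateException(
                        "partition " + part + " out of range for key " + key);
            }
            counts.merge(part, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample keys chosen only for illustration.
        List<String> keys = Arrays.asList("a", "b", "c", "d");
        SimplePartitioner<String> partitioner = new HashLikePartitioner<String>();
        // Prints a map from partition index to the number of keys assigned to it.
        System.out.println(histogram(partitioner, keys, 3));
    }
}
```

A test written this way exercises only the partitioning logic with more than one partition; it does not cover the shuffle or the reducers themselves, so it complements rather than replaces the patches mentioned in the thread.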