MapReduce >> mail # user >> Yarn never use TeraSort#TotalOrderPartitioner when run TeraSort job?


sam liu 2013-10-16, 02:02
sam liu 2013-10-18, 03:12
Arun C Murthy 2013-10-18, 21:03
Re: Yarn never use TeraSort#TotalOrderPartitioner when run TeraSort job?
Sam, I would guess that the jar file you think is running is not actually the one being used. My guess is that the task classpath contains a normal jar file (without your changes) that is being picked up before your modified jar file.
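
One way to check a guess like this is to print where a class was actually loaded from. The following is a minimal sketch in plain Java, not something from the Hadoop code base: `ClassOrigin` and `locate` are hypothetical names, and it assumes the class of interest can be loaded in the JVM where the check runs.

```java
import java.net.URL;
import java.security.CodeSource;

// Hypothetical helper (not part of Hadoop): reports where a class was loaded from.
class ClassOrigin {

    /** Returns the jar or directory a class came from, or "unknown" if it cannot be determined. */
    public static String locate(Class<?> c) {
        // The code source is usually the jar file or classes directory.
        CodeSource src = c.getProtectionDomain().getCodeSource();
        if (src != null && src.getLocation() != null) {
            return src.getLocation().toString();
        }
        // Fallback for classes without a code source (for example JDK classes):
        // look up the .class file as a resource.
        String path = c.getName().replace('.', '/') + ".class";
        ClassLoader cl = c.getClassLoader();
        URL url = (cl != null) ? cl.getResource(path) : ClassLoader.getSystemResource(path);
        return (url != null) ? url.toString() : "unknown";
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name, or inspect this class by default.
        String name = (args.length > 0) ? args[0] : ClassOrigin.class.getName();
        System.out.println(name + " loaded from " + locate(Class.forName(name)));
    }
}
```

For the TeraSort case, the same `locate` call would have to run inside the task JVM (for instance from code that writes to the task log) to show which jar supplied the partitioner class there; running it on the client only shows the client classpath.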

On Thursday, October 17, 2013 10:13 PM, sam liu <[EMAIL PROTECTED]> wrote:
 
This is really weird and confusing to me. Can anyone help with this question?

Thanks!
2013/10/16 sam liu <[EMAIL PROTECTED]>

>Hi Experts,
>
>In Hadoop-2.0.4, TeraSort uses TeraSort#TotalOrderPartitioner as its partitioner: 'job.setPartitionerClass(TotalOrderPartitioner.class);'. However, it seems that YARN does not execute the methods of TeraSort#TotalOrderPartitioner at all. I ran the following tests to verify this:
>
>Test 1: Add code to the methods readPartitions() and setConf() in TeraSort#TotalOrderPartitioner that prints some words and writes some words to a file.
>Expected Result: Some words are printed and written to a file
>Actual Result: Nothing was printed or written to a file
>
>Test 2: Remove all existing methods in TeraSort#TotalOrderPartitioner, leaving only the necessary methods with empty bodies
>Expected Result: The TeraSort job fails with an exception, because the specified partitioner is not implemented at all
>Actual Result: The TeraSort job completed successfully without any exception
>
>These tests confused me a lot, because it seems that YARN never uses the specified partitioner TeraSort#TotalOrderPartitioner during job execution.
>
>Can anyone help explain why?
>
>Thanks very much!
>