HDFS >> mail # user >> Re: Yarn never use TeraSort#TotalOrderPartitioner when run TeraSort job?


Re: Yarn never use TeraSort#TotalOrderPartitioner when run TeraSort job?
After I took the following actions, the job still passed, and it seems none
of the TotalOrderPartitioner classes were invoked at all:
- Modified libexec/hadoop-config.sh to put
hadoop-mapreduce-examples-2.0.4-alpha.jar at the front of the Hadoop classpath,
which should ensure that TeraSort#TotalOrderPartitioner is loaded first
- Fiddled with org.apache.hadoop.mapreduce.TotalOrderPartitioner, and then
replaced share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.0.4-alpha.jar
with the newly generated jar
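
One way to check the two steps above is to ask the JVM which jar a class is actually loaded from, rather than relying on classpath order. The following is only a minimal sketch: WhichJar and locate() are hypothetical names made up for this illustration, it uses only the standard java.security.CodeSource API, and it reports the classpath of the JVM it runs in, which may differ from the classpath of the task containers.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper (not part of Hadoop): prints the jar or
// directory each named class is loaded from on the current classpath.
public class WhichJar {

    /** Returns the load location of className, or a short note if unknown. */
    public static String locate(String className) {
        try {
            // 'false' skips static initialization; only the class is loaded.
            Class<?> cls = Class.forName(className, false,
                    WhichJar.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // Classes loaded by the bootstrap loader have no CodeSource.
            return src == null ? "(bootstrap/JDK)"
                               : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Pass fully qualified (binary) class names as arguments.
        for (String name : args) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

It could be run with something like java -cp "$(hadoop classpath):." WhichJar followed by the binary names of the classes in question. A nested class needs the $ form, for example 'org.apache.hadoop.examples.terasort.TeraSort$TotalOrderPartitioner' (quoted so the shell does not expand it), and the package names should be checked against the actual sources, since they are assumed here rather than taken from this thread.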
2013/10/19 Arun C Murthy <[EMAIL PROTECTED]>

> Apologies for the late response.
>
> In hadoop-2 TeraSort uses the new org.apache.hadoop.mapreduce apis (not
> org.apache.hadoop.mapred).
>
> Did you fiddle with the right TotalOrderPartitioner
> i.e. org.apache.hadoop.mapreduce.TotalOrderPartitioner?
>
> Arun
>
> On Oct 17, 2013, at 8:12 PM, sam liu <[EMAIL PROTECTED]> wrote:
>
> This is really weird and confusing to me. Can anyone help with this question?
>
> Thanks!
>
>
> 2013/10/16 sam liu <[EMAIL PROTECTED]>
>
>> Hi Experts,
>>
>> In Hadoop-2.0.4, TeraSort uses TeraSort#TotalOrderPartitioner as its
>> Partitioner: 'job.setPartitionerClass(TotalOrderPartitioner.class);'.
>> However, it seems YARN did not execute the methods of
>> TeraSort#TotalOrderPartitioner at all. I ran some tests to verify this,
>> as described below:
>>
>> Test 1: Add some code to the methods readPartitions() and setConf() in
>> TeraSort#TotalOrderPartitioner to print some words and write some words to
>> a file.
>> Expected Result: Some words should be printed and written to a file
>> Actual Result: Nothing was printed or written to a file at all
>>
>> Test 2: Remove all existing methods in TeraSort#TotalOrderPartitioner,
>> leaving only the necessary methods, with empty bodies
>> Expected Result: The TeraSort job should fail with some exception, as the
>> specified Partitioner is not implemented at all
>> Actual Result: The TeraSort job completed successfully without any exception
>>
>> The above tests confused me a lot, because it seems YARN never uses the
>> specified partitioner TeraSort#TotalOrderPartitioner at all during job
>> execution.
>>
>> Can anyone help explain the reason?
>>
>> Thanks very much!
>>
>
>
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/
>
>
>