HDFS, mail # user - Re: Yarn never use TeraSort#TotalOrderPartitioner when run TeraSort job?


sam liu 2013-10-20, 08:19
After I took the following actions, the job still completed successfully,
and it seems that none of the TotalOrderPartitioner classes were invoked at
all:
- Modified libexec/hadoop-config.sh to put
hadoop-mapreduce-examples-2.0.4-alpha.jar at the front of the Hadoop
classpath, which should ensure that TeraSort#TotalOrderPartitioner is
loaded first.
- Modified org.apache.hadoop.mapreduce.TotalOrderPartitioner, and then
replaced share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.0.4-alpha.jar
with the newly built jar.
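
One way to check which jar a class is actually loaded from, rather than
relying on classpath order, is to ask the JVM directly. The sketch below is
only an illustration: the WhichJar class and its locate() helper are
hypothetical names, not part of Hadoop. It inspects only the classpath of the
JVM it runs in (the client side), which may differ from the classpath inside
the YARN task containers.

```java
import java.net.URL;
import java.security.CodeSource;

// Hypothetical helper (not part of Hadoop): reports where a class was loaded from.
public class WhichJar {

    // Returns the jar or directory that provides the named class,
    // or a short marker string when no location is available.
    public static String locate(String className) {
        try {
            // 'false' avoids running the class's static initializers.
            Class<?> cls = Class.forName(className, false, WhichJar.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            if (source == null) {
                // Core JDK classes are loaded by the bootstrap loader and have no code source.
                return "(bootstrap class path)";
            }
            URL location = source.getLocation();
            return location == null ? "(unknown location)" : location.toString();
        } catch (ClassNotFoundException e) {
            return "(not found)";
        }
    }

    public static void main(String[] args) {
        // Each argument is a fully qualified class name; nested classes use '$'.
        for (String name : args) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

Assuming the `hadoop classpath` command is available, this could be run as
java -cp "$(hadoop classpath):." WhichJar 'org.apache.hadoop.examples.terasort.TeraSort$TotalOrderPartitioner'
where the nested class name is an assumption based on the
TeraSort#TotalOrderPartitioner notation used above.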

2013/10/19 Arun C Murthy <[EMAIL PROTECTED]>

> Apologies for the late response.
>
> In hadoop-2 TeraSort uses the new org.apache.hadoop.mapreduce apis (not
> org.apache.hadoop.mapred).
>
> Did you fiddle with the right TotalOrderPartitioner
> i.e. org.apache.hadoop.mapreduce.TotalOrderPartitioner?
>
> Arun
>
> On Oct 17, 2013, at 8:12 PM, sam liu <[EMAIL PROTECTED]> wrote:
>
> This is really weird and confusing to me. Can anyone help with this
> question?
>
> Thanks!
>
>
> 2013/10/16 sam liu <[EMAIL PROTECTED]>
>
>> Hi Experts,
>>
>> In Hadoop 2.0.4, TeraSort uses TeraSort#TotalOrderPartitioner as its
>> partitioner: 'job.setPartitionerClass(TotalOrderPartitioner.class);'.
>> However, it seems that Yarn does not execute the methods of
>> TeraSort#TotalOrderPartitioner at all. I ran the following tests to
>> verify this:
>>
>> Test 1: Add code to the readPartitions() and setConf() methods of
>> TeraSort#TotalOrderPartitioner that prints a message and writes it to a
>> file.
>> Expected result: The message is printed and written to the file.
>> Actual result: Nothing was printed or written to the file.
>>
>> Test 2: Remove all existing methods from TeraSort#TotalOrderPartitioner,
>> leaving only the required methods with empty bodies.
>> Expected result: The TeraSort job fails with an exception, because the
>> specified partitioner is not actually implemented.
>> Actual result: The TeraSort job completed successfully without any
>> exception.
>>
>> These results confuse me, because it seems that Yarn never uses the
>> specified partitioner, TeraSort#TotalOrderPartitioner, during job
>> execution.
>>
>> Can anyone explain why?
>>
>> Thanks very much!
>>
>
>
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/