Re: Performance tuning of sort
The input of each reducer is not the same; it depends on the input data
distribution and the Partitioner.
And the running time of each reducer consists of three phases: copy,
sort, and reduce.
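
For illustration, a minimal sketch of the Partitioner contract (new-style
org.apache.hadoop.mapreduce API). The bucketing scheme here is hypothetical,
not the one Sort uses, but it shows how a skewed key distribution translates
directly into unevenly sized reducer inputs:

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Partitioner;

    // The framework calls getPartition() once per map output record; the
    // returned number is the reducer that will receive the record.
    public class FirstLetterPartitioner extends Partitioner<Text, Text> {
        @Override
        public int getPartition(Text key, Text value, int numPartitions) {
            // Hypothetical scheme: bucket by the key's first character. If
            // most keys start with the same letter, one reducer receives most
            // of the data, and its copy/sort/reduce phases run longer than
            // the others'.
            String s = key.toString();
            int bucket = s.isEmpty() ? 0 : s.charAt(0);
            return (bucket & Integer.MAX_VALUE) % numPartitions;
        }
    }
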
2010/6/18 李钰 <[EMAIL PROTECTED]>:
> Hi Todd and Jeff,
>
> Thanks a lot for your discussion, it's really helpful to me. I'd like to
> express my special appreciation for Todd's patient explanation; you helped
> me see the working mechanism of Sort more clearly. And Jeff, thank you for
> reminding me that Sort uses the TotalOrderPartitioner to do the
> partitioning.
> Based on your discussion I have updated my understanding as follows:
> The sorting happens on the map side during the spill process of each map
> task; after that, the overall map outputs are partitioned by the
> TotalOrderPartitioner, which decides the input range of each reducer.
> Reducers fetch the map outputs assigned to them by the partitioner, merge
> them, and write the results into HDFS.
> Is this understanding right? Please correct me if you find any faults,
> thanks.
> If it is right, then my question rolls back to the original one: since the
> input data size and the operations of each reduce task are the same, what
> may cause the execution times of the reduce tasks to differ? All nodes used
> in my experiment are on the same rack, and they are homogeneous.
> Any suggestion will be highly appreciated, thanks.
>
> Best Regards,
> Carp
>
> 2010/6/18 Todd Lipcon <[EMAIL PROTECTED]>
>
>> On Thu, Jun 17, 2010 at 9:37 AM, Jeff Zhang <[EMAIL PROTECTED]> wrote:
>>
>> > Todd,
>> >
>> > Why is there a sort in the map task? The sorting here seems useless, in
>> > my opinion.
>> >
>> >
>> For map-only jobs there isn't. For jobs with reduce, typically the number
>> of reduce tasks is smaller than the number of map tasks, so parallelizing
>> the sort on the mappers and just doing a merge on the reducers is
>> beneficial. Second, this allows the combiner to run on the mapper by
>> identifying when it has multiple outputs for the same key. Third, this
>> allows improved compression on the map output (thus less intermediate data
>> transfer) by putting similar keys near each other (hopefully within the
>> compression window). Fourth, it kills two birds with one stone, since the
>> mappers already have to group outputs by partition.
>>
>> -Todd
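
As a concrete illustration of the second and third points, a sketch of how a
job opts into a combiner and map-output compression. The class and job names
here are illustrative (Sort itself has no useful combiner, so think of a
word-count-style job), and the compression property key varies by release:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Reducer;

    public class MapSideTuning {
        // A word-count-style combiner: because the map output is sorted, the
        // framework can hand the combiner all values for one key at once.
        public static class SumCombiner
                extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values,
                                  Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) sum += v.get();
                context.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Sorted map output places similar keys next to each other, which
            // is what makes this compression effective. The key is
            // "mapred.compress.map.output" in 0.20-era releases,
            // "mapreduce.map.output.compress" in later ones.
            conf.setBoolean("mapred.compress.map.output", true);

            Job job = new Job(conf, "combinable-job");
            job.setCombinerClass(SumCombiner.class);
        }
    }
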
>>
>>
>> >
>> >
>> > On Thu, Jun 17, 2010 at 9:26 AM, Todd Lipcon <[EMAIL PROTECTED]> wrote:
>> > > On Thu, Jun 17, 2010 at 12:43 AM, Jeff Zhang <[EMAIL PROTECTED]> wrote:
>> > >
>> > >> Your understanding of Sort is not right. The key concept of Sort is
>> > >> the TotalOrderPartitioner. Before the map-reduce job runs, the client
>> > >> side samples the input data to estimate its distribution. The mapper
>> > >> does nothing; each reducer fetches its data according to the
>> > >> TotalOrderPartitioner. The data within each reducer is locally sorted,
>> > >> and the reducers themselves are ordered (r0 < r1 < r2 ...), so the
>> > >> overall result data is sorted.
>> > >>
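
A minimal sketch of the client-side sampling step Jeff describes, using the
0.20-era org.apache.hadoop.mapred API (which the Sort example used at the
time). The input path, partition-file path, and sampling parameters are
placeholders:

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.SequenceFileInputFormat;
    import org.apache.hadoop.mapred.lib.InputSampler;
    import org.apache.hadoop.mapred.lib.TotalOrderPartitioner;

    public class TotalOrderSetup {
        public static void main(String[] args) throws Exception {
            JobConf job = new JobConf(TotalOrderSetup.class);
            job.setInputFormat(SequenceFileInputFormat.class);
            FileInputFormat.setInputPaths(job, new Path("/user/carp/sort-in"));
            job.setPartitionerClass(TotalOrderPartitioner.class);

            // Where the sampled split points (one boundary key per reducer
            // boundary) are written.
            Path partitionFile = new Path("/tmp/_sortPartitions");
            TotalOrderPartitioner.setPartitionFile(job, partitionFile);

            // Sample ~0.1% of records (at most 10000, from at most 10 splits)
            // to estimate the key distribution before the job starts.
            InputSampler.RandomSampler<Object, Object> sampler =
                new InputSampler.RandomSampler<Object, Object>(0.001, 10000, 10);
            InputSampler.writePartitionFile(job, sampler);
        }
    }
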
>> > >
>> > > The sorting happens on the map side, actually, during the spill
>> > > process. The mapper itself is an identity function, but the map task
>> > > code does perform a sort (on a <partition, key> tuple) as originally
>> > > described in this thread. Reducers just do a merge of mapper outputs.
>> > >
>> > > -Todd
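
A toy illustration (plain Java, not Hadoop's actual spill code) of why sorting
on the <partition, key> tuple kills two birds with one stone: a single sort
both groups records by destination reducer and leaves each reducer's slice
already ordered by key:

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.Comparator;
    import java.util.List;

    public class SpillSortSketch {
        static final class Rec {
            final int partition; final String key;
            Rec(int partition, String key) { this.partition = partition; this.key = key; }
        }

        public static void main(String[] args) {
            List<Rec> buffer = new ArrayList<Rec>(Arrays.asList(
                new Rec(1, "banana"), new Rec(0, "cherry"),
                new Rec(0, "apple"),  new Rec(1, "avocado")));

            // One sort: group by partition first, then order by key, so each
            // reducer's slice is contiguous in the spill file and pre-sorted.
            buffer.sort(Comparator.comparingInt((Rec r) -> r.partition)
                                  .thenComparing(r -> r.key));

            for (Rec r : buffer) System.out.println(r.partition + "\t" + r.key);
            // Output: 0 apple, 0 cherry, 1 avocado, 1 banana
        }
    }
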
>> > >
>> > >
>> > >>
>> > >>
>> > >>
>> > >> On Thu, Jun 17, 2010 at 12:13 AM, 李钰 <[EMAIL PROTECTED]> wrote:
>> > >> > Hi all,
>> > >> >
>> > >> > I'm doing some tuning of the sort benchmark of Hadoop. To be more
>> > >> > specific, I am running tests against the
>> > >> > org.apache.hadoop.examples.Sort class. Looking through the source
>> > >> > code, I think the map tasks take responsibility for sorting the
>> > >> > input data, and the reduce tasks just merge the map outputs and
>> > >> > write them into HDFS. But here I've got a question I couldn't
>> > >> > understand: the time cost of the reduce phase of each reduce task,
>> > >> > that is

Best Regards

Jeff Zhang