Re: Need help optimizing reducer
Mahesh Balija 2013-03-05, 09:00
The reducer looks fast up to 66% because that part of the progress covers
the shuffle and sort phases of the reduce task, when your actual reduce code
has NOT yet started.
The reduce side is divided into 3 phases of roughly 33% each: shuffle (fetch
data), sort, and finally user code (reduce). That is why your reduce appears
fast up to 66%. To speed up your program, you have to either use more
reducers or make your reducer code as optimized as possible.
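To make the 66% figure concrete, here is a minimal sketch of the three-phase progress model described above. It is a simplified illustration, not Hadoop's actual code: the class name ReduceProgress and the methods overall and phase are hypothetical, and the only assumption is that each phase contributes an equal third to the reported progress.

```java
import java.util.Locale;

// Simplified model of reduce-task progress reporting (an assumption for
// illustration, not Hadoop's implementation): shuffle, sort and the user's
// reduce code each contribute one third of the overall percentage.
final class ReduceProgress {

    // Clamp a phase fraction into the range [0, 1].
    static double clamp(double fraction) {
        return Math.max(0.0, Math.min(1.0, fraction));
    }

    // Overall progress as the average of the three phase fractions.
    static double overall(double shuffle, double sort, double reduce) {
        return (clamp(shuffle) + clamp(sort) + clamp(reduce)) / 3.0;
    }

    // Name the phase that a given overall progress value falls into.
    static String phase(double overallProgress) {
        double p = clamp(overallProgress);
        if (p < 1.0 / 3.0) {
            return "shuffle";
        } else if (p < 2.0 / 3.0) {
            return "sort";
        }
        return "reduce";
    }

    public static void main(String[] args) {
        // Shuffle and sort are finished, user reduce code has not started.
        double beforeUserCode = overall(1.0, 1.0, 0.0);
        System.out.println(String.format(Locale.ROOT, "%.1f%%", 100 * beforeUserCode));
        // Half of the user reduce code is done.
        double halfway = overall(1.0, 1.0, 0.5);
        System.out.println(String.format(Locale.ROOT, "%.1f%%", 100 * halfway));
    }
}
```

Under this model, everything above roughly 66% is time spent in your own reduce() logic, so with 600 thousand keys on a single reducer that last third is where the 2 hours go.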
On Tue, Mar 5, 2013 at 1:27 AM, Austin Chungath <[EMAIL PROTECTED]> wrote:
> Hi all,
> I have 1 reducer and I have around 600 thousand unique keys coming to it.
> The total data is only around 30 MB.
> My logic doesn't allow me to have more than 1 reducer.
> It's taking too long to complete, around 2 hours. (Up to 66% it's fast, then
> it slows down. I don't really think it has started doing anything until 66%,
> but then why does it show progress like that?)
> Are there any job execution parameters that can help improve reducer
> performance?
> Any suggestions to improve things when we have to live with just one
> reducer?