-
Need help optimizing reducer
Austin Chungath 2013-03-04, 19:57
Hi all,
I have 1 reducer and around 600 thousand unique keys coming to it. The total data is only around 30 MB. My logic doesn't allow me to have more than 1 reducer. It's taking too long to complete, around 2 hours. (Up to 66% it's fast, then it slows down. I don't really think it has started doing any real work before 66%, but then why does it show progress like that?) Are there any job execution parameters that can help improve reducer performance? Any suggestions to improve things when we have to live with just one reducer?
thanks, Austin
-
Re: Need help optimizing reducer
samir das mohapatra 2013-03-05, 06:23
Austin, I think you have to use a partitioner to spawn more than one reducer for a small data set. The default partitioner will allow you only one reducer; you have to override it and implement your own logic to spawn more than one reducer.

On Tue, Mar 5, 2013 at 1:27 AM, Austin Chungath <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> I have 1 reducer and I have around 600 thousand unique keys coming to it.
> The total data is only around 30 MB.
> My logic doesn't allow me to have more than 1 reducer.
> It's taking too long to complete, around 2 hours. (till 66% it's fast then
> it slows down/ I don't really think it has started doing anything till 66%
> but then why does it show like that?).
> Are there any job execution parameters that can help improve reducer
> performance?
> Any suggestions to improve things when we have to live with just one
> reducer?
>
> thanks,
> Austin
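For readers following the partitioner suggestion, here is a minimal plain-Java sketch (no Hadoop dependency) of how a hash-style partitioner assigns keys to reduce tasks. The formula is the one used by Hadoop's default `HashPartitioner`; the class and method names (`PartitionSketch`, `partitionFor`, `distribution`) and the `key-N` sample keys are hypothetical, chosen only for illustration. Note that, as far as the Hadoop API goes, the number of reduce tasks is set on the job (for example with `job.setNumReduceTasks(n)`), and the partitioner only decides which of those tasks receives a given key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PartitionSketch {
    // Hash-style assignment: clear the sign bit, then take the remainder.
    // This mirrors the formula in Hadoop's default HashPartitioner.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Count how many keys would be routed to each reduce task.
    static Map<Integer, Integer> distribution(Iterable<String> keys, int numReduceTasks) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (String key : keys) {
            counts.merge(partitionFor(key, numReduceTasks), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical keys standing in for the ~600 thousand unique keys.
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 600_000; i++) {
            keys.add("key-" + i);
        }
        // Print the per-reducer key counts for 4 reduce tasks.
        System.out.println(distribution(keys, 4));
    }
}
```

With a single reduce task every key maps to partition 0, so a custom partitioner only matters once the job is configured with more than one reducer and the reduce logic can tolerate the keys being split.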
-
Re: Need help optimizing reducer
Ajay Srivastava 2013-03-05, 06:30
Are you using a combiner? If not, that would be the first thing to do.

On 05-Mar-2013, at 1:27 AM, Austin Chungath wrote:
> Hi all,
>
> I have 1 reducer and I have around 600 thousand unique keys coming to it. The total data is only around 30 MB.
> My logic doesn't allow me to have more than 1 reducer.
> It's taking too long to complete, around 2 hours. (till 66% it's fast then it slows down/ I don't really think it has started doing anything till 66% but then why does it show like that?).
> Are there any job execution parameters that can help improve reducer performance?
> Any suggestions to improve things when we have to live with just one reducer?
>
> thanks,
> Austin
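To make the combiner suggestion concrete, here is a minimal plain-Java sketch (no Hadoop dependency) of what a combiner does: it pre-aggregates the output of one map task per key before anything is sent to the reducer. The sketch assumes the reduce logic is a sum, which is an assumption for illustration only, since the thread does not say what the reducer computes; the names `CombinerSketch` and `combine` are hypothetical. In a real job a combiner class would be registered with `job.setCombinerClass(...)`, and it is only safe when the reduce operation is associative and commutative.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CombinerSketch {
    // Collapse repeated keys from one map task into a single (key, sum) pair.
    // Assumption: the reducer sums values; other logic needs its own combiner.
    static Map<String, Long> combine(List<Map.Entry<String, Long>> mapOutput) {
        Map<String, Long> combined = new LinkedHashMap<>();
        for (Map.Entry<String, Long> record : mapOutput) {
            combined.merge(record.getKey(), record.getValue(), Long::sum);
        }
        return combined;
    }

    public static void main(String[] args) {
        // Hypothetical map output with one repeated key.
        List<Map.Entry<String, Long>> mapOutput = new ArrayList<>();
        mapOutput.add(new AbstractMap.SimpleEntry<String, Long>("a", 1L));
        mapOutput.add(new AbstractMap.SimpleEntry<String, Long>("b", 2L));
        mapOutput.add(new AbstractMap.SimpleEntry<String, Long>("a", 3L));

        // Three records go in; one record per distinct key comes out.
        System.out.println(combine(mapOutput));
    }
}
```

A combiner only shrinks the data when a map task emits the same key more than once, so with mostly unique keys the gain may be small.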