Re: Need help optimizing reducer
Austin,
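  I think you need a custom partitioner to use more than one reducer on a
small data set.
  With only one reduce task configured, every key goes to that single
reducer. To use several, raise the number of reduce tasks and override the
default partitioner with your own logic, so that keys which must be
processed together land on the same reducer.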
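
  As a rough sketch of what such a partitioner would compute, here is the
arithmetic in plain Java with no Hadoop dependency. The class name, the
method names and the "group:detail" key format are all made up for
illustration; your real keys and grouping rule will differ. hashPartition
mirrors what the default HashPartitioner does (hash code with the sign bit
masked off, modulo the number of reduce tasks), and groupPartition is one
possible custom rule that hashes only the group part of the key.

```java
import java.util.Map;
import java.util.TreeMap;

public class PartitionSketch {

    // Same arithmetic as Hadoop's default HashPartitioner:
    // clear the sign bit, then take the remainder by the reducer count.
    public static int hashPartition(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Hypothetical custom rule: keys look like "group:detail", and every key
    // with the same group must reach the same reducer, so only the group
    // part is hashed. Keys without ':' are treated as their own group.
    public static int groupPartition(String key, int numReduceTasks) {
        int sep = key.indexOf(':');
        String group = (sep < 0) ? key : key.substring(0, sep);
        return hashPartition(group, numReduceTasks);
    }

    public static void main(String[] args) {
        String[] keys = {"user1:a", "user1:b", "user2:a", "user3:a"};
        int numReduceTasks = 4; // assumed value, for illustration only

        // Show which reducer index each key would be routed to.
        Map<String, Integer> assignment = new TreeMap<>();
        for (String key : keys) {
            assignment.put(key, groupPartition(key, numReduceTasks));
        }
        System.out.println(assignment);
    }
}
```

  In an actual job the same rule would go into a subclass of
org.apache.hadoop.mapreduce.Partitioner that overrides getPartition, and be
registered with job.setPartitionerClass(...) together with
job.setNumReduceTasks(n). Whether that helps depends on whether your logic
really needs all keys in one place or only related keys together.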
On Tue, Mar 5, 2013 at 1:27 AM, Austin Chungath <[EMAIL PROTECTED]> wrote:

> Hi all,
>
> I have 1 reducer and I have around 600 thousand unique keys coming to it.
> The total data is only around 30 MB.
> My logic doesn't allow me to have more than 1 reducer.
> It's taking too long to complete, around 2 hours. (Up to 66% it's fast, then
> it slows down. I don't really think it has started doing anything until 66%,
> but then why does it show progress like that?)
> Are there any job execution parameters that can help improve reducer
> performance?
> Any suggestions to improve things when we have to live with just one
> reducer?
>
> thanks,
> Austin
>