Re: How does hadoop decide how many reducers to run?
Michael Segel 2013-01-11, 23:20
First, not enough information.
1) EC2, got it.
2) Which flavor of Hadoop? Is this EMR as well?
3) How many slots did you configure in your mapred-site.xml?
AWS EC2 cores aren't hyperthreaded, so with 8 cores per instance you'll probably end up with about 6 cores available for slots.
With 16 reducers running at once across 4 nodes, it sounds like you have 4 mapper and 4 reducer slots, or 8 slots total, configured per node. (Over-subscription is OK if you're not running HBase.)
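For reference, here's a sketch of the slot settings that would produce that behavior, assuming the classic MR1 framework (not YARN) and that the values below are just illustrative, not your actual config:

```xml
<!-- mapred-site.xml (per TaskTracker) -->
<configuration>
  <!-- Concurrent map task slots on this node (hypothetical value) -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>
  <!-- Concurrent reduce task slots on this node; 4 slots x 4 nodes
       would explain seeing 16 reducers in the first wave -->
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>4</value>
  </property>
</configuration>
```

Note the distinction: these properties cap how many reduce tasks run *concurrently* per node, while the total number of reducers for a job is whatever the job requests (e.g. via mapred.reduce.tasks), which is why 28 reducers ran as a wave of 16 followed by a wave of 12.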
So what are you missing?
On Jan 11, 2013, at 4:59 PM, Roy Smith <[EMAIL PROTECTED]> wrote:
> I ran a big job the other day on a cluster of 4 m2.4xlarge EC2 instances. Each instance is 8 cores, so 32 cores total. Hadoop ran 16 reducers, followed by a second wave of 12. It seems to me it was only using half the available cores. Is this normal? Is there some way to force it to use all the cores?
> Roy Smith
> [EMAIL PROTECTED]