Re: possibly Pig throttles the number of mappers
Dexin Wang 2011-03-24, 00:58
We are using 0.79. We also got an answer from the #hadoop channel, and with
this we will look into combining more work in each mapper and/or using Pig 0.8.
Thanks again for your help.
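
To make the split combination Alan describes below more concrete, here is a
minimal Python sketch of the idea. It is not Pig's actual implementation: the
greedy packing strategy, the function name combine_splits, the 128 MB limit,
and the 200 x 4 MB input are all assumptions chosen for illustration. Only the
cluster size (10 nodes, 12 map slots each) comes from this thread.

```python
def combine_splits(split_sizes, max_combined_size):
    """Greedily pack consecutive input splits into groups.

    Each returned group stands for one map task. A group is closed as soon
    as adding the next split would push it past max_combined_size. A single
    split larger than the limit still gets a group of its own.
    """
    groups = []
    current = []
    current_size = 0
    for size in split_sizes:
        if current and current_size + size > max_combined_size:
            groups.append(current)
            current = []
            current_size = 0
        current.append(size)
        current_size += size
    if current:
        groups.append(current)
    return groups


MB = 1024 * 1024

# Hypothetical input: 200 small splits of 4 MB each (assumed, not from the thread).
small_splits = [4 * MB] * 200

# Hypothetical upper bound on a combined split (assumed value).
combined = combine_splits(small_splits, max_combined_size=128 * MB)

# Cluster described in the thread: 10 nodes, 12 map slots per node.
map_slots = 10 * 12

print("map tasks without combination:", len(small_splits))
print("map tasks with combination:   ", len(combined))
print("map slots on the cluster:     ", map_slots)
```

Under these assumptions, a job that would otherwise launch one map task per
small split ends up with far fewer tasks than the cluster has slots, which is
the trade-off the -Dpig.noSplitCombination=true switch reverses.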
On Wed, Mar 23, 2011 at 5:55 PM, Alan Gates <[EMAIL PROTECTED]> wrote:
> What version of Pig are you using? Starting in 0.8 Pig will combine small
> blocks into a single map. This prevents jobs that actually are reading
> small amounts of data from taking a lot of slots on the cluster. You can
> turn this off by adding -Dpig.noSplitCombination=true to your command line.
> On Mar 23, 2011, at 5:45 PM, Dexin Wang wrote:
>> And the nodes are pretty lightly loaded (~1.0) and there's plenty of free
>> memory. Now I'm seeing 2 mappers per node. Very much under-utilized.
>> On Wed, Mar 23, 2011 at 1:39 PM, Dexin Wang <[EMAIL PROTECTED]> wrote:
>>> We've seen a strange problem where some Pig jobs would just run fewer
>>> mappers concurrently than the mapper capacity. Specifically we have a 10
>>> node cluster and each is configured to have 12 mappers. Normally we have
>>> mappers running. But for some Pig jobs it will only have 10 mappers
>>> (while nothing else is running), and actually appears to be 1 mapper per
>>> We have not noticed the same problem with other non-Pig Hadoop jobs.
>>> has experienced the same thing and have any explanation or remedy?