On Sat, Jun 23, 2012 at 3:30 AM, Sheng Guo <[EMAIL PROTECTED]> wrote:
> I know it is set automatically. But I have a large data set, and I want it
> to allocate more mappers at midnight so that more computing resources can
> be used to speed things up.
> Any suggestions?
Pig uses CombineInputFormat by default which attempts to combine a set
of physical input splits into one logical input split.
I use the following setting to control the number of mappers in some
of my benchmarking scripts:
-- combine up to this many bytes into a composite input split, i.e., per mapper
SET pig.maxCombinedSplitSize 250000000;
Note that your absolute minimum split size is constrained by the smallest
block size in your input set.
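
For the midnight case, a rough sketch of what I mean: lowering the cap gives each mapper less data, so the same input is spread over more mappers. As a loose rule of thumb (an approximation, not an exact formula), the mapper count is about total input bytes divided by pig.maxCombinedSplitSize, so halving the value should roughly double the mappers, as long as your blocks are not larger than the new cap. The paths and alias names below are placeholders, not anything from your job.

```pig
-- Hypothetical night-time variant: a smaller cap per composite split
-- should yield roughly twice as many mappers as 250000000 would.
SET pig.maxCombinedSplitSize 125000000;

-- 'input/logs' and 'output/logs_copy' are placeholder paths.
raw = LOAD 'input/logs' USING PigStorage('\t');
STORE raw INTO 'output/logs_copy';
```

You could keep two copies of the script (or pass the value in as a parameter) and have your scheduler run the smaller-cap version at night.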