Pig >> mail # user >> How can I set the mapper number for pig script?

Sheng Guo 2012-06-23, 02:27
Jagat Singh 2012-06-23, 04:31
Sheng Guo 2012-06-23, 07:30
Re: How can I set the mapper number for pig script?
On Sat, Jun 23, 2012 at 3:30 AM, Sheng Guo <[EMAIL PROTECTED]> wrote:
> I know it is automatically set. But I have a large data set, I want it
> allocate more mappers during midnight so that more computing resource could
> be used to speed up.
> Any suggestions?

Pig combines input splits by default: it attempts to merge a set of
physical input splits into one logical input split.
I use the following setting to control the number of mappers in some
of my benchmarking scripts:

-- combine up to this many bytes into a composite input split, i.e., per mapper
SET pig.maxCombinedSplitSize 250000000;

Note that your absolute minimum split size is constrained by the smallest
block size in your input set.
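
As a rough way to pick the value: the number of mappers is approximately the
total input size divided by the combined split size. Here is a minimal sketch,
assuming a hypothetical input of about 100 GB and a target of roughly 800
mappers (both figures are made up for illustration; the relation path and
loader are placeholders too):

```pig
-- Hypothetical sizing: 100 GB input, aiming for roughly 800 mappers.
--   100 * 1024^3 bytes / 800 mappers = 134217728 bytes (128 MB) per mapper
-- A smaller value gives more mappers, but only down to the block-size floor.
SET pig.maxCombinedSplitSize 134217728;

-- Placeholder input path and loader, not taken from the original thread.
data = LOAD 'input/path' USING PigStorage('\t');
```

The actual mapper count will differ somewhat from this estimate, since splits
are combined from whole blocks and files rather than cut at exact byte
boundaries.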
Scott Foster 2012-06-23, 16:40
Sheng Guo 2012-06-23, 20:48
Yang 2012-06-23, 21:58
John Meagher 2012-06-23, 23:15
Scott Foster 2012-06-26, 23:47