control splits / mapper numbers?
This question was asked a few days ago. Since then I dug through the
source code and found some knobs to turn, but after adding the settings
below I still could not get Pig to use smaller split sizes.

Any ideas?

-- this does not work for pig
-- SET mapred.map.tasks 100;
SET pig.mapsplits.count 100;
SET mapred.max.jobs.per.node 1;
-- the following changed after 0.21
-- SET mapred.min.split.size 100;
SET mapreduce.input.fileinputformat.split.minsize 100;
SET mapreduce.input.fileinputformat.split.maxsize 100;
SET pig.splitCombination false;
SET pig.noSplitCombination true;
SET pig.maxCombinedSplitSize 100;
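
For reference, here is a rough sketch of what the min/max split size settings above ask for, assuming the loader ends up using the rule in Hadoop's FileInputFormat.computeSplitSize, which is max(minSize, min(maxSize, blockSize)). The values are in bytes, so 100 means 100-byte splits. The function name and the 64 MB block size are illustrative assumptions, and this says nothing about whether Pig's own split combination overrides the result.

```python
def compute_split_size(block_size: int, min_size: int, max_size: int) -> int:
    """Split size in bytes, following max(minSize, min(maxSize, blockSize))."""
    return max(min_size, min(max_size, block_size))


# Assumed HDFS block size for illustration only (64 MB).
BLOCK_SIZE = 64 * 1024 * 1024

# With split.minsize = split.maxsize = 100, as in the settings above,
# the requested split size is 100 bytes regardless of block size.
requested = compute_split_size(BLOCK_SIZE, min_size=100, max_size=100)

# With a tiny minimum and an effectively unbounded maximum,
# the split size falls back to the block size.
fallback = compute_split_size(BLOCK_SIZE, min_size=1, max_size=2**63 - 1)

print(requested, fallback)
```

If that rule applied unchanged, the settings would produce one split per 100 bytes of input, so seeing block-sized splits suggests something else in the chain is picking the size.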