Pig >> mail # user >> control splits / mapper numbers?


control splits / mapper numbers?
This question was asked a few days ago. Since then I dug around the
source code and found some knobs to turn, but after adding the
following settings I still could not get Pig to use smaller split sizes.

Any ideas?

-- this does not work for pig
-- SET mapred.map.tasks 100;
SET pig.mapsplits.count 100;
SET mapred.max.jobs.per.node 1;
-- the following property name changed after 0.21
-- SET mapred.min.split.size 100;
SET mapreduce.input.fileinputformat.split.minsize 100;
SET mapreduce.input.fileinputformat.split.maxsize 100;
SET pig.splitCombination false;
SET pig.noSplitCombination true;
SET pig.maxCombinedSplitSize 100;
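
For reference, the split-size values above are in bytes. Below is a minimal sketch of how the min/max settings interact, assuming the formula used by Hadoop's new-API FileInputFormat (max(minSize, min(maxSize, blockSize))). The class name, the helper estimateSplits, and the 64 MB block / 1 GB file figures are hypothetical and only for illustration. Whether Pig's own loader and split combination honor these settings is a separate matter that this sketch does not cover.

```java
public class SplitSizeSketch {

    // Assumed to mirror FileInputFormat.computeSplitSize in the new (mapreduce) API:
    // the block size is clamped between the configured min and max split sizes.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Hypothetical helper: rough number of splits for one file (ceiling division).
    // Real Hadoop allows a small slop on the last split, so treat this as approximate.
    static long estimateSplits(long fileSize, long splitSize) {
        return (fileSize + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024;   // illustrative 64 MB block
        long fileSize = 1024L * 1024 * 1024;  // illustrative 1 GB input file

        // min = max = 100, as in the settings above: each split is 100 bytes.
        long tiny = computeSplitSize(blockSize, 100L, 100L);
        System.out.println("split size with min=max=100: " + tiny
                + " bytes, about " + estimateSplits(fileSize, tiny) + " splits");

        // To aim for roughly 100 mappers, derive the max split size from the input size.
        long target = estimateSplits(fileSize, 100L);
        long sized = computeSplitSize(blockSize, 1L, target);
        System.out.println("split size for ~100 splits: " + sized
                + " bytes, about " + estimateSplits(fileSize, sized) + " splits");
    }
}
```

Under this assumed formula, a max split size of 100 asks for 100-byte splits rather than 100 mappers, so the value would normally be derived from the input size divided by the desired mapper count.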