increasing number of mappers.
I have two input seq files, 32MB each, and I want to run them on as many
mappers as possible.

I appended -D mapred.max.split.size=1000000 as a command-line argument to
the job, but it makes no difference; the job still runs with 2 mappers.
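In case it matters, this is roughly how I'm submitting the job (the class
name and paths are made up for the example; the driver goes through
ToolRunner, which I understood is what makes -D options get picked up):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Driver sketch; "SplitTest" is a placeholder, not my real job class.
public class SplitTest extends Configured implements Tool {
  public int run(String[] args) throws Exception {
    // ToolRunner feeds the command line through GenericOptionsParser first,
    // so any -D key=value (e.g. -D mapred.max.split.size=1000000) should
    // already be in getConf() by the time run() is called.
    Job job = new Job(getConf(), "split-test");
    job.setJarByClass(SplitTest.class);
    job.setInputFormatClass(SequenceFileInputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    return job.waitForCompletion(true) ? 0 : 1;
  }

  public static void main(String[] args) throws Exception {
    System.exit(ToolRunner.run(new Configuration(), new SplitTest(), args));
  }
}

and then on the command line:

hadoop jar splittest.jar SplitTest -D mapred.max.split.size=1000000 /in /out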

How does the split size work? Is the max split size applied when reading files or when writing them?

Or does it work like this: you set the max split size when writing, get a
bunch of seq files as output, and then you end up with the same number of
mappers as there are input files?
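For what it's worth, here is the mental model I had of the split math,
from skimming the new-API FileInputFormat; the clamp below is my reading
of its computeSplitSize(), so corrections welcome:

// My understanding (possibly wrong): the split size is the block size
// clamped between the configured min and max split sizes, and the
// splitting happens at read time, when the job plans its input splits.
public class SplitMath {
  static long computeSplitSize(long blockSize, long minSize, long maxSize) {
    return Math.max(minSize, Math.min(maxSize, blockSize));
  }

  public static void main(String[] args) {
    long blockSize = 64L * 1024 * 1024; // default HDFS block size
    long minSize = 1L;                  // mapred.min.split.size default
    long maxSize = 1000000L;            // what I passed via -D mapred.max.split.size

    long splitSize = computeSplitSize(blockSize, minSize, maxSize);
    long fileSize = 32L * 1024 * 1024;  // one of my 32MB seq files

    // If this model were right, each 32MB file would yield ~34 splits,
    // i.e. far more than the 2 mappers I'm actually seeing.
    System.out.println("splitSize = " + splitSize);
    System.out.println("splits per file ~= " + ((fileSize + splitSize - 1) / splitSize));
  }
}

If that model is right, the setting should apply at read time and my files
should split into many mappers, so I must be missing something.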