NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
Pig >> mail # user >> Multiple maps for a small input file


Re: Multiple maps for a small input file
Set mapred.min.split.size

D

On Mon, Apr 23, 2012 at 4:30 PM, Sam William <[EMAIL PROTECTED]> wrote:
> I have a file on HDFS with a reduced block size. I created it by overriding the dfs.block.size param on the hadoop fs -put command. hadoop fsck shows that this file has 15 blocks (as opposed to the normal 1 block). I did this to force Pig to use more maps than normal. On my pig command line I specify 'pig -Dpig.splitCombination=false' to turn off the default split combination logic. The job still ends up running just one mapper. How can I achieve multiple maps? Splitting the original file into multiple files would be my last resort.
>
>
>
> Sam William
> [EMAIL PROTECTED]
>
>
>
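To illustrate why mapred.min.split.size can matter here, below is a rough Python sketch of the split-size arithmetic that Hadoop's FileInputFormat is generally understood to use: split size = max(min size, min(max size, block size)), with a 10% slop on the last split. This is a simplified model, not the actual Hadoop or Pig code. The file size, block size, and the 64 MB minimum are hypothetical numbers chosen for illustration; the thread does not state them, and it does not confirm that this is what happened on Sam's cluster.

```python
MB = 1024 * 1024
SPLIT_SLOP = 1.1  # the last split may be up to 10% larger than split_size


def compute_split_size(block_size, min_size, max_size):
    """Simplified model: the minimum split size wins over the block size."""
    return max(min_size, min(max_size, block_size))


def count_splits(file_size, block_size, min_size=1, max_size=2**63 - 1):
    """Count the input splits (roughly, map tasks) for one splittable file."""
    split_size = compute_split_size(block_size, min_size, max_size)
    remaining = file_size
    splits = 0
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    if remaining > 0:
        splits += 1  # whatever is left becomes the final split
    return splits


# Hypothetical 15 MB file written with a 1 MB block size (15 blocks).
small_min = count_splits(15 * MB, 1 * MB, min_size=1)
# Same file, but with a hypothetical 64 MB minimum split size configured.
large_min = count_splits(15 * MB, 1 * MB, min_size=64 * MB)

print(small_min, large_min)
```

Under this model, a minimum split size larger than the file folds all 15 blocks into a single split no matter how small the blocks are. If that is the cause, passing a small value the same way as the other property in the question (for example a -Dmapred.min.split.size=... option on the pig command line, value left to the reader) would be the thing to try.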