

Re: A small portion of map tasks slows down the job
This is to be expected if there is any kind of trend in the ordering of your data, or any heavy computation in the mappers.

You can use a smaller input split to reduce the load on each individual mapper, so that large blocks of records that take a long time to process are less likely to clog one mapper.
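
To illustrate the effect, here is a minimal sketch in plain Java (no Hadoop dependency). It is a toy model, not Hadoop's actual split logic: the class name, the method names, and the per-record cost figures are all made up for the example. It cuts a run of records into consecutive splits and reports the cost of the most expensive split, which stands in for the slowest map task. In a real job the split size would be capped through the job setup, for example with `FileInputFormat.setMaxInputSplitSize` in the new MapReduce API; that part is not shown here.

```java
import java.util.Arrays;

// Toy model (hypothetical names and numbers): each record has a processing
// cost, and the expensive records are clustered together in the input.
class SplitSkewSketch {

    // Cost of the most expensive split when the records are cut into
    // consecutive splits of at most recordsPerSplit records each.
    static long slowestSplitCost(long[] recordCosts, int recordsPerSplit) {
        long worst = 0;
        for (int start = 0; start < recordCosts.length; start += recordsPerSplit) {
            int end = Math.min(start + recordsPerSplit, recordCosts.length);
            long cost = 0;
            for (int i = start; i < end; i++) {
                cost += recordCosts[i];
            }
            worst = Math.max(worst, cost);
        }
        return worst;
    }

    // 16 records: cost 1 each, except records 4..7, which cost 10 each.
    static long[] clusteredCosts() {
        long[] costs = new long[16];
        Arrays.fill(costs, 1);
        Arrays.fill(costs, 4, 8, 10);
        return costs;
    }

    public static void main(String[] args) {
        long[] costs = clusteredCosts();
        for (int size : new int[] {8, 4, 2}) {
            System.out.println(size + " records/split -> slowest split cost "
                    + slowestSplitCost(costs, size));
        }
    }
}
```

With these made-up costs the slowest split costs 44 at 8 records per split, 40 at 4, and 20 at 2. The total work is unchanged; the expensive cluster is simply spread over more map tasks, so no single task holds all of it.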

Jay Vyas
MMSB
UCHC

On Oct 2, 2012, at 9:04 PM, Huanchen Zhang <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I have a small portion of map tasks whose output is much larger than the others' (more spills), so the reducer is mainly waiting for these few map tasks. Is there a good solution for this problem?
>
> Thank you.
>
> Best,
> Huanchen