Hadoop >> mail # dev >> Gzip progress during map phase.


Gzip progress during map phase.
Hi,

I noticed that the mapper progress indicator in the Hadoop CDH3
distribution jumps from 0% to 100% for each gzipped input file. So when
running with large gzipped input files, the job appears to be stuck.
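
To illustrate the symptom, here is a toy sketch. It is not the Hadoop source, and the class and method names are made up for this example. It assumes the reader computes progress from its position against an end offset: if that end offset is unknown for a gzip stream and a sentinel value is used instead, the reported value stays near 0 until the file is finished. Whether CDH3 actually does this is an assumption, not something I have verified.

```java
// Toy model only: GzipProgressSketch and progress() are hypothetical names,
// not part of Hadoop.
public class GzipProgressSketch {

    /** Progress through a split as a fraction, capped at 1.0. */
    static float progress(long start, long pos, long end) {
        if (end == start) {
            return 0.0f;
        }
        return Math.min(1.0f, (pos - start) / (float) (end - start));
    }

    public static void main(String[] args) {
        long compressedLength = 1_000_000L; // bytes on disk (known up front)
        long uncompressedPos = 4_000_000L;  // bytes handed to the mapper so far

        // Assumption: the uncompressed length is unknown, so a sentinel end
        // offset is used. The ratio is then vanishingly small.
        float withSentinel = progress(0, uncompressedPos, Long.MAX_VALUE);

        // Alternative: measure how far we are into the compressed bytes.
        long compressedPos = 500_000L;
        float withCompressedPos = progress(0, compressedPos, compressedLength);

        System.out.println("sentinel end:        " + withSentinel);
        System.out.println("compressed position: " + withCompressedPos);
    }
}
```

In this model the first value rounds to 0% in any progress display, while the second tracks the compressed bytes consumed and moves steadily.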

I was unable to find a JIRA issue that describes this effect.
Before I dive into this, I have a few questions for you:
1) Is this a known effect in the 0.20 version? If so, what is the JIRA
issue?
2) Is this specific to gzip?
3) Is this effect still present in the MRv2/YARN version of Hadoop?

Thanks.
--
Kind regards,
Niels Basjes
(Sent from mobile)