Hadoop >> mail # user >> M/R, Strange behavior with multiple Gzip files


M/R, Strange behavior with multiple Gzip files
Hi everybody,

I have an M/R job that does a bulk import into HBase.
I have to process many gzip files (2800 x ~100 MB).

I don't understand why my job instantiates 80 maps but runs each map
sequentially, as if there were only one big gz file.

Is there a problem in my driver? Or maybe I am missing something.
I use "FileInputFormat.addInputPath(job, new Path(args[0]))", where args[0]
is a directory.

Can you help me, please?

Thanks, Guillaume
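
For reference: gzip is not a splittable codec, so with the stock FileInputFormat-based input formats each .gz file is normally read whole by a single map task. The sketch below is only a sanity check under stated assumptions: it assumes the input directory is reachable as a local path, and the class and method names (GzipFileCount, countGzipFiles) are hypothetical, not part of Hadoop. It counts the .gz files in a directory so the result can be compared with the number of map tasks the job reports.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper, not part of Hadoop: counts the gzip inputs in a directory.
public class GzipFileCount {

    // Returns the number of regular files directly under dir whose name ends in ".gz".
    // Assumption: dir is a local (or locally mounted) path, not an HDFS URI.
    static long countGzipFiles(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".gz"))
                    .count();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: GzipFileCount <input-dir>");
            return;
        }
        long count = countGzipFiles(Paths.get(args[0]));
        // Each unsplittable .gz file is normally handled by exactly one map task.
        System.out.println("gzip files found: " + count);
    }
}
```

If the count differs a lot from the number of maps the job actually creates, that mismatch is worth mentioning when posting the driver code.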
Harsh J 2012-12-05, 17:33
x6i4uybz labs 2012-12-06, 16:25
Harsh J 2012-12-06, 16:39