Hadoop >> mail # user >> Map Reduce jobs taking a long time at the end


Map Reduce jobs taking a long time at the end
Hey,

We are running MapReduce jobs against a 12-machine HBase cluster. For a
long time they took approximately 30 minutes to return a result against ~95
million rows. Without any major changes to the data or any upgrade of
HBase/Hadoop, they now seem to take about 4 hours, and the logs are
full of entries like these:

2012-12-04 13:33:15,602 INFO org.apache.hadoop.mapred.TaskTracker:
attempt_201211210952_0293_m_000031_0 0.0% row: 63 6f 6d 2e 70 72 6f 75
67 68 74
...
2012-12-04 13:45:17,134 INFO org.apache.hadoop.mapred.TaskTracker:
attempt_201211210952_0293_m_000031_0 0.0% row: 63 6f 6d 2e 70 75 72 70
6c 65 64 65 73 69 67 6e 73 65 72 76 69 63 65 73
...
2012-12-04 13:46:11,515 INFO org.apache.hadoop.mapred.TaskTracker:
attempt_201211210952_0293_m_000031_0 0.0% row: 63 6f 6d 2e 70 75 73 68
74 6f 74 61 6c 6b 2d 6f 6e 6c 69 6e 65

I presume the 0.0% is the percent complete, but I'm not sure why the time
to complete has jumped so massively. Ganglia shows no major load on the
nodes in question, so I don't think load is the cause.
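
As a side note, the hex bytes after "row:" look like ASCII-encoded row
keys. A minimal Python sketch, assuming the keys are plain ASCII and using
only the three entries quoted above (wrapped lines joined), decodes them and
measures the time between the first and last entry. The names `entries`,
`decode_row`, and `elapsed_seconds` are illustrative, not part of Hadoop or
HBase:

```python
from datetime import datetime

# The three status entries quoted above: (timestamp, hex bytes after "row:").
entries = [
    ("2012-12-04 13:33:15,602",
     "63 6f 6d 2e 70 72 6f 75 67 68 74"),
    ("2012-12-04 13:45:17,134",
     "63 6f 6d 2e 70 75 72 70 6c 65 64 65 73 69 67 6e "
     "73 65 72 76 69 63 65 73"),
    ("2012-12-04 13:46:11,515",
     "63 6f 6d 2e 70 75 73 68 74 6f 74 61 6c 6b 2d 6f 6e 6c 69 6e 65"),
]


def decode_row(hex_bytes: str) -> str:
    """Decode space-separated hex bytes, assuming the row key is ASCII."""
    return bytes.fromhex(hex_bytes).decode("ascii")


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two log timestamps in the log's own format."""
    fmt = "%Y-%m-%d %H:%M:%S,%f"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()


for timestamp, row in entries:
    print(timestamp, decode_row(row))

print("elapsed seconds:", elapsed_seconds(entries[0][0], entries[-1][0]))
```

If the ASCII assumption holds, the task moved only from com.prought to
com.pushtotalk-online in roughly 13 minutes, which would suggest the scan is
progressing slowly rather than hanging outright.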

What steps should I take to try to troubleshoot the problem?

Regards,

Jay