Hi Jason,

On May 22, 2013, at 3:35pm, Jason Weiss wrote:
OK, thanks. Sounds like you were pegged on CPU usage.

But that does surprise me a bit. Did you check that you were using all cores?

PS - back in 2006 I spent a week of hell debugging an occasional job failure on Hadoop (this was when it was still part of Nutch). It turned out that one of our 12 slaves was accidentally using OpenJDK, which had a JIT compiler bug that would occasionally rear its ugly head. Obviously the Sun/Oracle JRE isn't bug-free, but it gets a lot more stress testing. So one of my basic guidelines in the ops portion of the Hadoop class I teach is that every server must have exactly the same version of Oracle's JRE.
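
A rough sketch of how the "same JRE everywhere" guideline could be checked. It assumes you have already collected the text that `java -version` prints on each node (for example over ssh; that step is not shown). The function names are hypothetical, made up for illustration, and this is not a tool from the Hadoop distribution.

```python
from collections import defaultdict


def group_hosts_by_jvm(version_output_by_host):
    """Group hosts by the text that `java -version` printed on each one.

    version_output_by_host: dict mapping a host name to the raw output
    captured on that host (assumed to be collected elsewhere).
    Returns a dict mapping each distinct output to the list of hosts
    that produced it.
    """
    groups = defaultdict(list)
    for host, output in version_output_by_host.items():
        # Normalize surrounding whitespace only; any other difference
        # (vendor, version, build number) counts as a real mismatch.
        key = "\n".join(line.strip() for line in output.strip().splitlines())
        groups[key].append(host)
    return dict(groups)


def cluster_is_uniform(version_output_by_host):
    """True when every host reported exactly the same JVM banner."""
    return len(group_hosts_by_jvm(version_output_by_host)) == 1
```

If `cluster_is_uniform` returns False, the smallest groups from `group_hosts_by_jvm` are the odd nodes out, which is the kind of stray-OpenJDK slave described above.
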
Ken Krugler
+1 530-210-6378
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr