Has anyone experienced a TaskTracker/DataNode behaving like the attached graph?
This was during a MR job (which runs often). Note the extremely high
System CPU time. Upon investigating, I saw that out of 64GB of RAM the system
had allocated almost 45GB to cache!
I ran sudo sh -c "sync ; echo 3 > /proc/sys/vm/drop_caches ; sync", which
is roughly where the graph goes back to normal (much lower System, much
This has happened a few times.
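
For anyone who wants to compare on their own nodes, here is a rough sketch of how the cache figure can be read. The helper name meminfo_gb is made up for illustration, and it assumes the usual Linux /proc/meminfo layout with values reported in kB.

```shell
#!/bin/sh
# Hypothetical helper: print MemTotal, MemFree, Buffers and Cached in GB.
# Takes the meminfo file as an optional argument (defaults to /proc/meminfo).
meminfo_gb() {
    awk '/^(MemTotal|MemFree|Buffers|Cached):/ {
        # values are in kB; 1048576 kB = 1 GB
        printf "%s %.1f GB\n", $1, $2 / 1048576
    }' "${1:-/proc/meminfo}"
}

# Only read the live file where it exists (i.e. on Linux).
if [ -r /proc/meminfo ]; then
    meminfo_gb
fi
```

Running this before and after dropping the caches should show the Cached line shrinking.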
I have tried playing with the sysctl vm.swappiness value (default of 60) by
setting it to 30 (which it was at when the graph was collected) and now to
10. I am not sure that helps.
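
In case it is useful, this is roughly how the value can be read and changed. The function names show_swappiness and set_swappiness are illustrative only. The runtime change needs root and, as far as I know, does not survive a reboot unless it is also added to /etc/sysctl.conf.

```shell
#!/bin/sh
# Hypothetical helper: print the current swappiness.
# Reads /proc/sys/vm/swappiness unless another file is given.
show_swappiness() {
    cat "${1:-/proc/sys/vm/swappiness}"
}

# Hypothetical helper: change the value at runtime (run as root).
set_swappiness() {
    sysctl -w "vm.swappiness=$1"
}

# To keep the setting across reboots, a line like the following
# would go into /etc/sysctl.conf:
#   vm.swappiness = 10
```

For example, as root: set_swappiness 10, then show_swappiness to confirm.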
Any ideas? Anyone else run into this before?
4x2TB SATA3 HDD
Running Hadoop 1.0.4, with a DataNode (2GB heap) and a TaskTracker (2GB heap) with
24 map slots (1GB heap each), no reducers.
Also running HBase 0.94.2 with a RegionServer (8GB RAM) on this machine.
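
As a sanity check, here is the sum of the configured maximums from the setup above. It assumes the 8GB for the RegionServer is its heap, and it ignores JVM overhead beyond the heap, so treat it as a rough upper bound.

```shell
#!/bin/sh
# Figures from the setup above, all in GB.
total_ram=64
datanode=2
tasktracker=2
map_slots=24
map_heap=1
regionserver=8   # assumption: the 8GB is the RegionServer heap

# Sum of all configured heaps if every map slot is busy.
heaps=$(( datanode + tasktracker + map_slots * map_heap + regionserver ))
# What would remain for the OS and the cache in that case.
left=$(( total_ram - heaps ))

echo "configured heaps: ${heaps} GB, left for OS and cache: ${left} GB"
```

With every slot busy and every heap fully used, only about 28GB would remain. So with almost 45GB in cache, either not all slots were running or the heaps were not fully committed at that point.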