Re: Strange machine behavior
Robert Dyer 2012-12-09, 02:10
Yes, but even with an MR job running, the heap totals only 36GB out of 64GB
of RAM. That leaves plenty for the OS and caching.
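As a quick check of that 36GB figure, here is a minimal sketch of the arithmetic. It assumes the heap sizes listed in the original message quoted further down (DataNode 2GB, TaskTracker 2GB, RegionServer 8GB, 24 map slots at 1GB each). The function and variable names are made up for illustration, and configured heap is only a lower bound on what each JVM actually occupies.

```python
# Heap sizes in GB, taken from the configuration in the quoted message.
HEAPS_GB = {
    "DataNode": 2,
    "TaskTracker": 2,
    "HBase RegionServer": 8,
}
MAP_SLOTS = 24
MAP_SLOT_HEAP_GB = 1
TOTAL_RAM_GB = 64


def heap_budget(daemon_heaps, slots, slot_heap_gb, total_ram_gb):
    """Return (total configured heap, RAM left over), both in GB."""
    total_heap = sum(daemon_heaps.values()) + slots * slot_heap_gb
    return total_heap, total_ram_gb - total_heap


total_heap, remaining = heap_budget(
    HEAPS_GB, MAP_SLOTS, MAP_SLOT_HEAP_GB, TOTAL_RAM_GB
)
print(f"configured heap: {total_heap} GB")
print(f"left for OS and page cache: {remaining} GB")
```

With every map slot busy, this gives 36GB of configured heap and 28GB left over, so heap alone should not exhaust the machine.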
The problem seems to be that the OS prefers keeping its cache over giving
memory to the applications. Once I drop the caches, I can rerun the MR job
several times and it runs perfectly fine.
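One way to watch for this before it bites is to track how much RAM the page cache holds. The sketch below parses text in the /proc/meminfo format and reports the `Cached` field as a fraction of `MemTotal`. The helper names and the sample values are illustrative assumptions (the sample is sized to match the roughly 45GB of cache out of 64GB reported in this thread); it is not output captured from the machine.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> value."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])  # most fields are in kB
    return fields


def cached_fraction(fields):
    """Fraction of total RAM held by the page cache (the Cached field)."""
    return fields["Cached"] / fields["MemTotal"]


# Illustrative sample only: about 45 GB cached on a 64 GB machine.
SAMPLE = """\
MemTotal:       67108864 kB
MemFree:         1048576 kB
Buffers:          524288 kB
Cached:         47185920 kB
"""

sample_fraction = cached_fraction(parse_meminfo(SAMPLE))
print(f"cached: {sample_fraction:.0%} of RAM")

# On a Linux host, the same check against live values would be:
# with open("/proc/meminfo") as f:
#     print(cached_fraction(parse_meminfo(f.read())))
```

A large cached fraction is normal on Linux and is not a problem by itself; the figure is only useful alongside the System CPU graph, to see whether the two rise together.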
On Dec 8, 2012 7:06 PM, "Marcos Ortiz" <[EMAIL PROTECTED]> wrote:
> Are you sure that 24 map slots is a good number for this machine?
> Remember that you have three services (DN, TT and HRegionServer) with
> a combined 12 GB of heap.
> Try to use a lower number of map slots (12 for example) and launch your
> MR job again.
> Can you share your logs in pastebin?
> On Sat 08 Dec 2012 07:09:02 PM CST, Robert Dyer wrote:
>> Has anyone experienced a TaskTracker/DataNode behaving like the
>> attached image?
>> This was during a MR job (which runs often). Note the extremely high
>> System CPU time. Upon investigating I saw that out of 64GB ram the
>> system had allocated almost 45GB to cache!
>> I did a sudo sh -c "sync ; echo 3 > /proc/sys/vm/drop_caches ; sync"
>> which is roughly where the graph goes back to normal (much lower
>> System, much higher User).
>> This has happened a few times.
>> I have tried playing with the sysctl vm.swappiness value (default of
>> 60) by setting it to 30 (which it was at when the graph was collected)
>> and now to 10. I am not sure that helps.
>> Any ideas? Anyone else run into this before?
>> 24 cores
>> 64GB ram
>> 4x2TB sata3 hdd
>> Running Hadoop 1.0.4, with a DataNode (2gb heap), TaskTracker (2gb
>> heap) on this machine.
>> 24 map slots (1gb heap each), no reducers.
>> Also running HBase 0.94.2 with a RS (8gb ram) on this machine.
> Marcos Luis Ortíz Valmaseda
> about.me/marcosortiz <http://about.me/marcosortiz>
> @marcosluis2186 <http://twitter.com/marcosluis2186>