Zookeeper, mail # user - ZooKeeper's resident set size grows but does not shrink

Re: ZooKeeper's resident set size grows but does not shrink
Henry Robinson 2012-05-24, 01:01
Although the amount of Java heap that ZK is using may go down, the JVM
process will still hang on to the physical memory allocated for it, and if
there is no external pressure from other processes, Linux will not need to
swap it to disk. Hence the RSS will remain roughly constant.

That is, the amount of 'real' memory used by a JVM doesn't tell you how
much of the JVM's heap is being used. If you believe that the heap usage by
ZK really is too high, because GC is not finding enough unreachable objects
to reclaim, then that will become a problem: if you ever do have memory
pressure, ZK will start swapping, which is bad.

In general, processes on Linux don't usually give memory back - they hold
on to as much as they have needed at any one time, and the operating system
eventually swaps out the unused pages if it needs to.
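
To watch the RSS side of this independently of the heap, the resident set
size can be read from /proc/<pid>/status on Linux. The following is a
minimal sketch, not part of ZooKeeper or of the test suite: the function
names and the sample text are hypothetical, and the sample value is made up
to correspond to roughly 2.6 GB.

```python
import re
from pathlib import Path


def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kb(pid):
    """Read the current RSS of a live process (Linux only)."""
    return parse_vmrss_kb(Path(f"/proc/{pid}/status").read_text())


# Hypothetical sample of /proc/<pid>/status content, for illustration only.
sample_status = (
    "Name:\tjava\n"
    "VmPeak:\t 4200000 kB\n"
    "VmRSS:\t 2726298 kB\n"
    "Threads:\t40\n"
)

rss_kb = parse_vmrss_kb(sample_status)
print(f"RSS: {rss_kb} kB ({rss_kb / 1024 ** 2:.2f} GiB)")
```

Sampling read_rss_kb(<zk-pid>) before, during and after the test shows what
the operating system has resident for the process. It says nothing about
how much of the Java heap is live, which is why the heap has to be inspected
separately.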

Can you paste the output of jmap -heap <zk-pid> into a reply? That will
allow us to see how much of the heap is really being used.


On 23 May 2012 17:41, Brian Oki <[EMAIL PROTECTED]> wrote:

> Hello,
> We use ZooKeeper 3.3.3.  On a 3-node site, we've been using Patrick Hunt's
> publicly available latencies test suite to create scenarios that will help
> us to understand the memory, CPU and disk requirements for a deployment of
> ZooKeeper for our type of workload.  We use a fourth node as the ZooKeeper
> (ZK) client to conduct the tests.
> We modified zk-latencies.py slightly so that it only creates, sets and
> deletes znodes.  In particular, we create 1000 permanent znodes, each
> written with 250,000 bytes of data.  We do this create-set-delete in a
> loop, sleeping for 5 seconds between iterations.
> We observe at the ZK leader that the Resident Set Size (RSS) memory climbs
> rapidly to 2.6 GB on an 8 GB RAM node.  The Java heap size of each ZK
> server daemon is 3 GB.
> Further, once the test has gone through 15 iterations, all the znodes
> created on behalf of the test have been deleted.  There is no further write
> activity to ZK, and no read activity at all.  The system is quiesced.  No
> other services are competing for the disk, CPU or RAM during the test.
> Our question is this: The RSS of the ZK leader (and the followers) seems to
> remain at 2.6 GB after the test has completed.  Why?
> We would expect that since all relevant znodes for the test have been
> deleted, the leader's RSS should have shrunk considerably, even after 1
> hour has passed.  Are we missing something?
> We have used jmap to inspect the heap.  Understanding the heap contents
> requires detailed implementation knowledge that we don't have, so we didn't
> pursue this avenue any further.
> Configuration:
>   3 node servers running ZK daemons as 3-server ensemble
>   1 client machine
>   each node has 8 GB RAM
>   each node has 4 cores
>   each node has a 465 GB disk
>   ZK release: 3.3.3
>   ZK server java heap size: 3 GB
>   GC: concurrent low-pause garbage collector
>   NIC: bonded 1 Gb NIC
> Thank you.
> Sincerely,
> Brian
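
For a sense of scale, the raw payload of the workload described in the
quoted message can be worked out from its stated figures. This is a rough
sketch only. It assumes that all 1000 znodes of an iteration exist at the
same time before they are deleted, and it counts only the 250,000 bytes
written at create time: the set step, JVM object overhead, request buffers
and the transaction log are all ignored. The variable names are
illustrative.

```python
# Figures taken from the quoted message.
ZNODES_PER_ITERATION = 1000
BYTES_PER_ZNODE = 250_000
ITERATIONS = 15

# Raw data held in the znode tree at the peak of one iteration,
# assuming every znode of the iteration exists before deletion starts.
payload_per_iteration = ZNODES_PER_ITERATION * BYTES_PER_ZNODE

# Raw create payload pushed through the ensemble over the whole test.
total_create_payload = payload_per_iteration * ITERATIONS

print(f"payload per iteration: {payload_per_iteration:,} bytes")
print(f"create payload over {ITERATIONS} iterations: {total_create_payload:,} bytes")
```

Under these assumptions a single iteration holds about 250 MB of znode data,
so the heap has to grow to at least that size while the test runs, and any
pages the JVM touches while doing so count towards the RSS.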

