HBase version: 0.90.3 + Patches
Hadoop version: CDH3u0
Relevant JIRA: https://issues.apache.org/jira/browse/HBASE-2937
We have been using the 'hbase.client.operation.timeout' knob
introduced in HBASE-2937 for quite some time now. It helps us enforce
our SLA.
We have two HBase clusters and two HBase client clusters. One of them
is much busier than the other.
We have seen consistent, reproducible behavior in the clients running
in the busy cluster: their memory footprint increases steadily
after they have been up for roughly 24 hours.
The footprint almost doubles from its usual value (the usual case
being RPC timeout disabled). After much investigation we found nothing
concrete, and we had to put in a hack
that keeps the heap size under control even when RPC timeout is
enabled. Also
please note that the same behavior is not observed in the 'not so busy'
cluster.
The patch is here: https://gist.github.com/1288023
Could someone who is also running with RPC timeout enabled in
production under fair load please share their experience?