I have been consistently hitting the following error in one of my QA
clusters. I came across two JIRAs: the first one (HBASE-3466) was closed
as "Cannot Reproduce", but the issue was later reopened as HBASE-5285.
I am using HBase 0.94.4 and Hadoop 1.0.4 with 24 region servers
(8 cores, 8 GB RAM).
In HBASE-5285, Ted Yu commented that it could be due to a hash code
collision. But if caching is enabled, wouldn't the check for block
existence return the block whose hash collides? In that case the code
should never reach the method that puts the block into the cache, unless
there is some concurrency issue.
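To make the reasoning above concrete, here is a minimal sketch using a toy cache. ToyBlockCache, its method names, the keys, and the exception message are all hypothetical and are not HBase's actual block cache code. It illustrates two points under those assumptions: in a plain Java map, two keys with the same hashCode() but different equals() remain separate entries, so a lookup for one does not return the other; and a check-then-put sequence can still fail if two readers both see a miss before either one inserts. The interleaving is simulated sequentially so the outcome is deterministic.

```java
import java.util.HashMap;
import java.util.Map;

class CacheRaceSketch {
    // Hypothetical toy cache for illustration only; NOT HBase's implementation.
    static class ToyBlockCache {
        private final Map<String, byte[]> map = new HashMap<>();

        byte[] getBlock(String key) {
            return map.get(key);
        }

        // Refuses to overwrite an existing entry (assumed behaviour for this sketch).
        void cacheBlock(String key, byte[] block) {
            if (map.containsKey(key)) {
                throw new IllegalStateException("block already cached: " + key);
            }
            map.put(key, block);
        }
    }

    // "Aa" and "BB" have the same String.hashCode() but are not equal,
    // so a map lookup for one does not return the other's value.
    static boolean collidingKeysStayDistinct() {
        ToyBlockCache cache = new ToyBlockCache();
        cache.cacheBlock("Aa", new byte[] {1});
        return "Aa".hashCode() == "BB".hashCode() && cache.getBlock("BB") == null;
    }

    // Simulates two readers whose check-then-put steps interleave:
    // both observe a miss before either one inserts the block.
    static boolean interleavedReadersCollide() {
        ToyBlockCache cache = new ToyBlockCache();
        String key = "hypothetical-file_offset-0";
        boolean missA = cache.getBlock(key) == null; // reader A checks
        boolean missB = cache.getBlock(key) == null; // reader B checks
        if (missA) {
            cache.cacheBlock(key, new byte[] {1});   // reader A inserts
        }
        try {
            if (missB) {
                cache.cacheBlock(key, new byte[] {1}); // reader B inserts the same key
            }
        } catch (IllegalStateException e) {
            return true; // the second put hit an already cached block
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("colliding keys stay distinct: " + collidingKeysStayDistinct());
        System.out.println("interleaved readers collide:  " + interleavedReadersCollide());
    }
}
```

If the real cache behaves like this toy one, a hash code collision alone would not explain the error, while an unsynchronized check-then-put would. Whether either assumption holds for HBase 0.94.4 would need to be confirmed against the actual source.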
Also, HBASE-5285 states that the error occurred during compaction for the
reporter, but I have disabled compaction in my cluster, so the error is
not limited to compaction.
Let me know if you need any more information. I can volunteer to submit a
patch if we can find the root cause.
ramkrishna vasudevan 2013-05-06, 05:45
Viral Bajaria 2013-05-06, 05:49
Jean-Daniel Cryans 2013-05-06, 17:29