Coprocessor Increments
Hi All,
   We have been running into an RPC deadlock issue on HBase, and from our
investigation we believe the root cause is that we are doing cross-region
increments from a coprocessor. After some further searching and
reading over this
<http://mail-archives.apache.org/mod_mbox/hbase-user/201212.mbox/%3CCA+RK=_BP8k1Z-gQ+38RiipKgzi+=5Cn3EkZDJZ_Z-2QT8xOZ+[EMAIL PROTECTED]%3E>
we think that we can solve this by doing the increments locally on the
region. My question is: what happens if the row key specified does not
land in the current region? We can obviously do our best to make sure
that it does, but is there any way to be absolutely sure that it does?
This is supposing we use incrementColumnValue() from the HRegion class
(<http://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/regionserver/HRegion.html#incrementColumnValue%28byte[],%20byte[],%20byte[],%20long,%20boolean%29>).
Here is the method signature for reference:

public long incrementColumnValue(byte[] row,
                                 byte[] family,
                                 byte[] qualifier,
                                 long amount,
                                 boolean writeToWAL)

If we specify a row, there seems to be no guarantee that row will be
confined to the region the coprocessor is on.
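
To make the concern concrete, here is a minimal sketch of the check we
have in mind: does a row key fall inside a region's [start key, end key)
range? It is plain Java with no HBase dependency; the class name, helper
names, and the split point "California-5" are made up for illustration.
In 0.94, HRegionInfo.containsRow(byte[]) appears to perform an equivalent
check, but treat that as something to verify rather than a given.

```java
import java.nio.charset.StandardCharsets;

public class RegionRangeCheck {
    // Unsigned lexicographic byte comparison, the ordering used for row keys.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // True if row lies in [startKey, endKey). An empty endKey is treated
    // as "no upper bound", an empty startKey as "no lower bound".
    static boolean containsRow(byte[] startKey, byte[] endKey, byte[] row) {
        return compareBytes(row, startKey) >= 0
                && (endKey.length == 0 || compareBytes(row, endKey) < 0);
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] start = bytes("California-");
        byte[] end = bytes("California-5"); // hypothetical split point

        // The inserted row is inside this region's range ...
        System.out.println(containsRow(start, end, bytes("California-12345")));
        // ... but the counter row sorts after the split point.
        System.out.println(containsRow(start, end, bytes("California-total")));
    }
}
```

A coprocessor could run a check like this before attempting a local
increment and fall back to some other path when the row is out of range.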

If the call does force the increment onto the local region, what will
happen if a later call for the same row key is made from another region?

Contrived Example

Inserting rowkey "California-12345" triggers a coprocessor call to
incrementColumnValue() with a rowkey of "California-total", all on Region 1.

Initially, the increment would likely land on the same region as the
insert. But as the table grows, the "California-total" row could end up
on another region. If the increment is confined to the local region, then
suppose we later insert "California-95424", which again triggers a call to
incrementColumnValue() with a rowkey of "California-total", this time all
on Region 2.

Are we now left with two rows keyed "California-total", one in each
region? If so, what happens if these two regions are later merged
into one?
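
As a rough model of the scenario above, the sketch below maps region
start keys to regions in a sorted map and looks up which region owns each
row key. It assumes the usual layout in which regions partition the sorted
key space, so each row key belongs to exactly one region. The class name,
the region labels, and the split point "California-5" are hypothetical,
and String ordering stands in for byte ordering (the two agree for these
ASCII keys).

```java
import java.util.TreeMap;

public class RegionLookupSketch {
    // A row belongs to the region with the greatest start key <= row key.
    static String regionFor(TreeMap<String, String> regions, String rowKey) {
        return regions.floorEntry(rowKey).getValue();
    }

    // Table after a hypothetical split at "California-5".
    static TreeMap<String, String> splitTable() {
        TreeMap<String, String> regions = new TreeMap<>();
        regions.put("", "Region 1");             // first region: empty start key
        regions.put("California-5", "Region 2"); // second region starts at the split
        return regions;
    }

    public static void main(String[] args) {
        TreeMap<String, String> regions = splitTable();
        String[] keys = {"California-12345", "California-95424", "California-total"};
        for (String key : keys) {
            System.out.println(key + " -> " + regionFor(regions, key));
        }
    }
}
```

In this model "California-12345" maps to Region 1 while "California-total"
maps to Region 2, which is exactly the case where an increment issued from
the coprocessor on Region 1 would target a row that Region 1 does not own.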

Hopefully this all makes sense. We are on HBase 0.94.10. If we are going
about this all wrong, that could be the issue as well :)

Thanks.

  -John Weatherford