HBase >> mail # dev >> Question on Coprocessors and Atomicity


Re: Question on Coprocessors and Atomicity
I think the feature you are looking for is a Constraint. Currently they are
being added to 0.94 in
HBASE-4605<https://issues.apache.org/jira/browse/HBASE-4605>;
they are almost ready to be rolled in, and backporting to 0.92 is
definitely doable.

However, Constraints aren't going to be quite flexible enough to
efficiently support what you are describing. With a constraint, you are
ideally just checking the put value against some simple rule (never over
10, or always an integer). Looking at the current state of the table
before allowing the put would currently require creating a full-blown
connection to the local table through another HTable.
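
To make the "simple rule" case concrete, here is a minimal sketch in plain Java. All of the type names (ValueConstraint, MaxIntConstraint, ConstraintViolation) are hypothetical and are not the API proposed in HBASE-4605; the point is only that such a check sees the bytes being written and nothing else, so it cannot compare against what is already stored.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: these types are hypothetical, NOT the HBASE-4605 API.
final class ConstraintSketch {

    /** Thrown when a proposed value fails a check. */
    static final class ConstraintViolation extends Exception {
        ConstraintViolation(String message) {
            super(message);
        }
    }

    /** A stateless check that sees only the bytes being written. */
    interface ValueConstraint {
        void check(byte[] value) throws ConstraintViolation;
    }

    /** Enforces "always an integer" and "never over max". */
    static final class MaxIntConstraint implements ValueConstraint {
        private final int max;

        MaxIntConstraint(int max) {
            this.max = max;
        }

        @Override
        public void check(byte[] value) throws ConstraintViolation {
            final int parsed;
            try {
                parsed = Integer.parseInt(new String(value, StandardCharsets.UTF_8));
            } catch (NumberFormatException e) {
                throw new ConstraintViolation("value is not an integer");
            }
            if (parsed > max) {
                throw new ConstraintViolation("value " + parsed + " is over " + max);
            }
        }
    }

    /** Convenience wrapper: true if the constraint lets the value through. */
    static boolean accepts(ValueConstraint constraint, String value) {
        try {
            constraint.check(value.getBytes(StandardCharsets.UTF_8));
            return true;
        } catch (ConstraintViolation e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ValueConstraint constraint = new MaxIntConstraint(10);
        for (String candidate : new String[] {"7", "11", "abc"}) {
            System.out.println(candidate + " accepted: " + accepts(constraint, candidate));
        }
    }
}
```

Note that check() never receives the current cell value, which is exactly the limitation described above for a rule like "old value minus new value must exceed a threshold".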

In the short term, you could write a simple coprocessor to do this checking
and then move over to constraints (which are a simpler, more flexible way
of doing this) once the necessary features have been added.

It is worth discussing whether it makes sense to give a constraint access
to the local region. That breaks the idea a little bit, but it would
certainly be useful and not overly wasteful in terms of runtime.

Supposing the feature to talk to the local table were added, and since
the puts are going to be serialized on the regionserver (at least for the
single row you are trying to update), you will never get a situation where
the value added is over the threshold. If you are really worried about the
atomicity of the operation, then when doing the put, first get the RowLock,
then do the put and release the RowLock. However, that latter method is
going to be really slow, so it should only be used as a stopgap if the
constraint doesn't work as expected, until a patch is made for constraints.
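
The lock, check, put, unlock sequence can be sketched without any HBase calls. The snippet below is a plain-Java simulation under stated assumptions: an in-memory map stands in for the table, a per-row ReentrantLock stands in for the RowLock, and the rule being enforced ("the stored value never exceeds the threshold") is a simplified stand-in for the arbitrary function in the original question. The class and method names are made up for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Plain-Java simulation; no HBase API is used. Names are hypothetical.
final class RowLockSketch {
    private final ConcurrentHashMap<String, Long> rows = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();
    private final long threshold;

    RowLockSketch(long threshold) {
        this.threshold = threshold;
    }

    /** Lock the row, check the rule against the current value, then write. */
    boolean addIfWithinThreshold(String row, long delta) {
        ReentrantLock lock = locks.computeIfAbsent(row, k -> new ReentrantLock());
        lock.lock(); // stands in for acquiring the row lock
        try {
            long current = rows.getOrDefault(row, 0L);
            long proposed = current + delta;
            if (proposed > threshold) {
                return false; // reject: the write would breach the threshold
            }
            rows.put(row, proposed); // the "put", done while the lock is held
            return true;
        } finally {
            lock.unlock(); // always release, even if the check throws
        }
    }

    long get(String row) {
        return rows.getOrDefault(row, 0L);
    }

    /** Fires concurrent +1 writes at one row and returns its final value. */
    static long runDemo(int threads, int attemptsPerThread, long threshold) {
        RowLockSketch table = new RowLockSketch(threshold);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < attemptsPerThread; i++) {
                    table.addIfWithinThreshold("row1", 1L);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("writers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return table.get("row1");
    }

    public static void main(String[] args) {
        // 8 writers x 100 attempts = 800 attempted increments against a cap of 500.
        System.out.println("final value: " + runDemo(8, 100, 500L));
    }
}
```

Because the read, the check, and the write all happen while the lock is held, the 800 attempted increments leave the row at exactly 500. Every writer to the row queues on the same lock, which is where the slowness mentioned above comes from.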

Feel free to open up a ticket and link it to 4605 for adding the local
table access functionality, and we can discuss the de/merits of adding the
access.

-Jesse

On Sat, Dec 3, 2011 at 6:24 AM, Suraj Varma <[EMAIL PROTECTED]> wrote:

> I'm looking at the preCheckAndPut / postCheckAndPut api with
> coprocessors and I'm wondering ... are these pre/post checks done
> _after_ taking the row lock or is the row lock only done within the
> checkAndPut api.
>
> I'm interested in seeing if we can implement something like:
> (in pseudo sql)
> update table-name
> set column-name = new-value
> where (column-value - new-value) > threshold-value
>
> Basically ... I want to enhance the checkAndPut to not just compare
> "values" ... but apply an arbitrary function on the value _atomically_
> in the Put call. Multiple threads would be firing these mutations and
> I'd like the threshold-value above to never be breached under any
> circumstance.
>
> Is there a solution that can be implemented either via checkAndPut or
> using coprocessors preCheckAndPut? If not, would this be a useful
> feature to build in HBase?
>
> Thanks,
> --Suraj
>

--
-------------------
Jesse Yates
240-888-2200
@jesse_yates