HBase >> mail # dev >> Question on Coprocessors and Atomicity

Re: Question on Coprocessors and Atomicity
Thanks - I see that the lock is taken internal to checkAndMutate.

I'm wondering whether it would be a better idea to pass in a
Constraint (or even several Constraints) as the checkAndMutate argument.
Right now it takes a Comparator and a CompareOp for verification,
but that could just be a special case of a Constraint that is
evaluated within the lock.

In other words, we could open up a richer Constraint-checking API
where any "functional" Constraint check can be performed inside the
checkAndPut operation.

This would also avoid the performance impact of taking a row lock in
preCheckAndPut and releasing it in postCheckAndPut. And it really is
(in my mind) a more generic implementation of compare-and-set.
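
To make the idea concrete, here is a minimal sketch against a toy in-memory
store, not HBase itself. Every name in it (ConstraintCasSketch, Constraint,
equalTo, and this checkAndPut signature) is hypothetical and chosen only for
illustration. It assumes one lock per row and shows the existing EQUAL
comparison as just one constraint among many.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Predicate;

// Toy stand-in for a region holding one cell per row. NOT HBase code.
final class ConstraintCasSketch {
    // Hypothetical constraint: sees the current cell value (null if absent).
    interface Constraint extends Predicate<byte[]> {}

    private final Map<String, byte[]> cells = new ConcurrentHashMap<>();
    private final Map<String, ReentrantLock> rowLocks = new ConcurrentHashMap<>();

    // The check and the write run under the same row lock, so no other
    // writer to this row can slip in between them.
    boolean checkAndPut(String row, Constraint constraint, byte[] newValue) {
        ReentrantLock lock = rowLocks.computeIfAbsent(row, r -> new ReentrantLock());
        lock.lock();
        try {
            if (!constraint.test(cells.get(row))) {
                return false; // constraint rejected the current state
            }
            cells.put(row, newValue); // newValue must be non-null in this toy
            return true;
        } finally {
            lock.unlock();
        }
    }

    byte[] get(String row) {
        return cells.get(row);
    }

    // Today's compare-for-equality, expressed as a special-case constraint.
    static Constraint equalTo(byte[] expected) {
        return current -> Arrays.equals(current, expected);
    }

    public static void main(String[] args) {
        ConstraintCasSketch store = new ConstraintCasSketch();
        // Succeeds only while the row is still empty.
        System.out.println(store.checkAndPut("row1", equalTo(null), new byte[] {1}));
        // An arbitrary "functional" check evaluated under the lock.
        System.out.println(store.checkAndPut("row1", cur -> cur != null && cur[0] < 10, new byte[] {2}));
    }
}
```

The point of the design is that the caller supplies the predicate while the
store keeps ownership of the lock, so no coprocessor hook has to hold a row
lock across calls.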

I also see the potential for passing in multiple constraints (say,
upper/lower bounds on Increment/Decrement operations).
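
As a rough illustration of the multiple-constraints case, the sketch below
applies an increment only if the resulting value satisfies every supplied
bound. It is again a toy, not HBase: BoundedIncrementSketch and
incrementIfAllowed are hypothetical names, and per-row atomicity is borrowed
from ConcurrentHashMap.compute rather than from a region row lock.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongPredicate;

// Toy counter store. NOT HBase code; all names are hypothetical.
final class BoundedIncrementSketch {
    private final ConcurrentHashMap<String, Long> counters = new ConcurrentHashMap<>();

    // Applies delta only if the candidate value passes every constraint.
    // compute() runs the lambda atomically for the given key, so the check
    // and the update cannot be interleaved with another update to that row.
    boolean incrementIfAllowed(String row, long delta, List<LongPredicate> constraints) {
        boolean[] applied = {false};
        counters.compute(row, (key, current) -> {
            long base = (current == null) ? 0L : current;
            long candidate = base + delta;
            for (LongPredicate constraint : constraints) {
                if (!constraint.test(candidate)) {
                    return current; // leave the row untouched
                }
            }
            applied[0] = true;
            return candidate;
        });
        return applied[0];
    }

    long get(String row) {
        return counters.getOrDefault(row, 0L);
    }

    public static void main(String[] args) {
        BoundedIncrementSketch store = new BoundedIncrementSketch();
        List<LongPredicate> bounds = List.of(v -> v >= 0, v -> v <= 10);
        System.out.println(store.incrementIfAllowed("row1", 7, bounds));
        System.out.println(store.incrementIfAllowed("row1", 5, bounds));
        System.out.println(store.get("row1"));
    }
}
```

The constraints should stay cheap and must not touch the map themselves,
since they run while the row's entry is locked.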

On Sat, Dec 3, 2011 at 7:44 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
> From HRegionServer.checkAndPut():
>    if (region.getCoprocessorHost() != null) {
>      Boolean result = region.getCoprocessorHost()
>        .preCheckAndPut(row, family, qualifier, CompareOp.EQUAL, comparator,
>          put);
> ...
>    boolean result = checkAndMutate(regionName, row, family, qualifier,
>      CompareOp.EQUAL, new BinaryComparator(value), put,
>      lock);
> We can see that the lock isn't taken for preCheckAndPut().
> To satisfy Suraj's requirement, I think a slight change to checkAndPut() is
> needed so that atomicity can be achieved across preCheckAndPut() and
> checkAndMutate().
> Cheers
> On Sat, Dec 3, 2011 at 4:54 PM, Suraj Varma <[EMAIL PROTECTED]> wrote:
>> Just so my question is clear: everything I'm suggesting is in the
>> context of a single row (not cross-row or cross-table). So yes, I'm
>> guessing that obtaining a RowLock on the region side during
>> preCheckAndPut / postCheckAndPut would certainly work. That was why I
>> was asking whether pre/postCheckAndPut obtains the row lock or whether
>> the row lock is only obtained within checkAndPut.
>> Let's say the coprocessor takes a rowlock in preCheckAndPut ... will
>> that even work? i.e. can the same rowlock be inherited by the
>> checkAndPut api within that thread's context? Or will preCheckAndPut
>> have to release the lock so that checkAndPut can take it (which won't
>> work for my case, as it has to be atomic between the preCheck and
>> Put.)
>> Thanks for pointing me to the Constraints functionality - I'll take a
>> look at whether it could potentially work.
>> --Suraj
>> On Sat, Dec 3, 2011 at 10:25 AM, Jesse Yates <[EMAIL PROTECTED]> wrote:
>> > I think the feature you are looking for is a Constraint. Currently they
>> > are being added to 0.94 in
>> > HBASE-4605<https://issues.apache.org/jira/browse/HBASE-4605>;
>> > they are almost ready to be rolled in, and backporting to 0.92 is
>> > definitely doable.
>> >
>> > However, Constraints aren't going to be quite flexible enough to
>> > efficiently support what you are describing. For instance, with a
>> > constraint, you are ideally just checking the put value against some
>> > simple constraint (never over 10 or always an integer), but looking at
>> > the current state of the table before allowing the put would currently
>> > require creating a full blown connection to the local table through
>> > another HTable.
>> >
>> > In the short term, you could write a simple coprocessor to do this
>> > checking and then move over to constraints (which are a simpler, more
>> > flexible way of doing this) when the necessary features have been added.
>> >
>> > It is worth discussing whether it makes sense to have access to the
>> > local region through a constraint. Though that breaks the idea a little
>> > bit, it would certainly be useful and not overly wasteful in terms of
>> > runtime.
>> >
>> > Supposing the feature would be added to talk to the local table, and
>> > since the puts are going to be serialized on the regionserver (at least
>> > to that single row you are trying to update), you will never get a
>> > situation