HBase >> mail # dev >> Question on Coprocessors and Atomicity


Re: Question on Coprocessors and Atomicity
Thanks - I see that the lock is taken internal to checkAndMutate.

I'm wondering whether it would be a better idea to pass a Constraint
(or even multiple Constraints) as the checkAndMutate argument. Right
now it takes a Comparator and a CompareOp for verification. But that
could just be a special case of a Constraint that is evaluated within
the lock.

In other words, we could open up a richer Constraint-checking API
where any "functional" Constraint check can be performed inside the
checkAndPut operation.

This would also avoid the performance impact of taking a rowLock in
preCheckAndPut and releasing it in postCheckAndPut. And - it is
really (in my mind) implementing compare-and-set more generically.

I also see the potential of passing in multiple constraints (say
upper/lower bounds in Increment/Decrement operations) etc.
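
To make the idea concrete, here is a rough sketch of a check-and-put that takes a list of constraints and evaluates them while holding the row lock. It is plain Java with no HBase dependency, and every name in it (ConstrainedRowStore, Constraint, currentEquals, atLeast, atMost) is hypothetical, made up for illustration; it is not the HBASE-4605 Constraint API and not how HRegion implements checkAndMutate.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch only; none of these types are HBase APIs. */
public class ConstrainedRowStore {

    /** A check over the currently stored value and the proposed new value. */
    @FunctionalInterface
    public interface Constraint {
        boolean allows(Long current, long proposed);
    }

    private final Map<String, Long> rows = new ConcurrentHashMap<>();
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    /** Writes the value only if every constraint passes; check and write share one lock. */
    public boolean checkAndPut(String row, long proposed, List<Constraint> constraints) {
        ReentrantLock lock = locks.computeIfAbsent(row, r -> new ReentrantLock());
        lock.lock();
        try {
            Long current = rows.get(row); // null when the row does not exist yet
            for (Constraint c : constraints) {
                if (!c.allows(current, proposed)) {
                    return false; // rejected, nothing is written
                }
            }
            rows.put(row, proposed);
            return true;
        } finally {
            lock.unlock();
        }
    }

    public Long get(String row) {
        return rows.get(row);
    }

    /** The plain equality check, expressed as one special case of a Constraint. */
    public static Constraint currentEquals(Long expected) {
        return (current, proposed) -> Objects.equals(current, expected);
    }

    /** Lower bound on the proposed value. */
    public static Constraint atLeast(long lower) {
        return (current, proposed) -> proposed >= lower;
    }

    /** Upper bound on the proposed value. */
    public static Constraint atMost(long upper) {
        return (current, proposed) -> proposed <= upper;
    }

    public static void main(String[] args) {
        ConstrainedRowStore store = new ConstrainedRowStore();
        // Row is absent, 5 is within [0, 10]: accepted.
        System.out.println(store.checkAndPut("row1", 5,
                List.of(currentEquals(null), atLeast(0), atMost(10)))); // true
        // Current value matches, but 11 exceeds the upper bound: rejected.
        System.out.println(store.checkAndPut("row1", 11,
                List.of(currentEquals(5L), atMost(10)))); // false
        System.out.println(store.get("row1")); // 5
    }
}
```

The point of the sketch is only that the equality comparison and the bounds checks go through the same code path, under a single lock acquisition, so nothing has to be held across separate pre/post hooks.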

--Suraj
On Sat, Dec 3, 2011 at 7:44 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
> From HRegionServer.checkAndPut():
>    if (region.getCoprocessorHost() != null) {
>      Boolean result = region.getCoprocessorHost()
>        .preCheckAndPut(row, family, qualifier, CompareOp.EQUAL, comparator,
>          put);
> ...
>    boolean result = checkAndMutate(regionName, row, family, qualifier,
>      CompareOp.EQUAL, new BinaryComparator(value), put,
>      lock);
> We can see that the lock isn't taken for preCheckAndPut().
>
> To satisfy Suraj's requirement, I think a slight change to checkAndPut() is
> needed so that atomicity can be achieved across preCheckAndPut() and
> checkAndMutate().
>
> Cheers
>
> On Sat, Dec 3, 2011 at 4:54 PM, Suraj Varma <[EMAIL PROTECTED]> wrote:
>
>> Just so my question is clear ... everything I'm suggesting is in the
>> context of a single row (not cross row / table). - so, yes, I'm
>> guessing obtaining a RowLock on the region side during preCheckAndPut
>> / postCheckAndPut would certainly work. Which was why I was asking
>> whether the pre/postCheckAndPut obtains the row lock or whether the
>> row lock is only obtained within checkAndPut.
>>
>> Let's say the coprocessor takes a rowlock in preCheckAndPut ... will
>> that even work? i.e. can the same rowlock be inherited by the
>> checkAndPut api within that thread's context? Or will preCheckAndPut
>> have to release the lock so that checkAndPut can take it (which won't
>> work for my case, as it has to be atomic between the preCheck and
>> Put.)
>>
>> Thanks for pointing me to the Constraints functionality - I'll take a
>> look at whether it could potentially work.
>> --Suraj
>>
>> On Sat, Dec 3, 2011 at 10:25 AM, Jesse Yates <[EMAIL PROTECTED]>
>> wrote:
>> > I think the feature you are looking for is a Constraint. Currently they
>> > are being added to 0.94 in
>> > HBASE-4605 <https://issues.apache.org/jira/browse/HBASE-4605>;
>> > they are almost ready to be rolled in, and backporting to 0.92 is
>> > definitely doable.
>> >
>> > However, Constraints aren't going to be quite flexible enough to
>> > efficiently support what you are describing. For instance, with a
>> > constraint, you are ideally just checking the put value against some
>> > simple constraint (never over 10 or always an integer), but looking at
>> > the current state of the table before allowing the put would currently
>> > require creating a full blown connection to the local table through
>> > another HTable.
>> >
>> > In the short term, you could write a simple coprocessor to do this
>> > checking and then move over to constraints (which are a simpler, more
>> > flexible, way of doing this) when the necessary features have been added.
>> >
>> > It is worth discussing if it makes sense to have access to the local
>> > region through a constraint, though that breaks the idea a little bit,
>> > it would certainly be useful and not overly wasteful in terms of runtime.
>> >
>> > Supposing the feature would be added to talk to the local table, and
>> > since the puts are going to be serialized on the regionserver (at least
>> > to that single row you are trying to update), you will never get a
>> > situation