HBase >> mail # dev >> Question on Coprocessors and Atomicity


Re: Question on Coprocessors and Atomicity
I've been following HBASE-4605 with interest and I'm going through the
patches. I don't want to take away from all the hard work that's gone
into it ...

The more I think of it, I'm wondering how the Constraint can be
enforced without enforcing atomicity.

From the jira description, the intention of this feature is:
"Essentially, people would implement a 'Constraint' interface for
checking keys before they are put into a table. Puts that are valid
get written to the table, but if not people can will throw an
exception that gets propagated back to the client explaining why the
put was invalid."

If the row lock is released between the time the coprocessor finishes
"preXXXX" checks and the core mutation method is invoked (as has been
discussed in this thread), how can the Constraint be ensured? If two
requests are being processed in parallel, there is every possibility
that both requests pass the "Constraint" check individually, but break
it together (e.g. even simple checks like column value == 10 would
break if two requests fire concurrently).
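To make the interleaving concrete, here is a minimal sketch of the check-then-act gap. It is not HBase code: the class, the `preCheck` / `apply` methods, and the "value must stay <= 10" constraint are all hypothetical stand-ins for a "pre" hook followed by an unlocked mutation. The two requests are replayed in a fixed order (check A, check B, apply A, apply B) so the outcome is deterministic rather than dependent on thread timing.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntPredicate;

// Hypothetical stand-in for a region: one integer cell per row, no locking.
class UnlockedCheckDemo {
    static final int LIMIT = 10;
    // Hypothetical constraint: the stored value must never exceed LIMIT.
    static final IntPredicate CONSTRAINT = v -> v <= LIMIT;

    private final Map<String, Integer> rows = new HashMap<>();

    UnlockedCheckDemo(String row, int initial) {
        rows.put(row, initial);
    }

    // Plays the role of a "pre" hook: validates against the value visible now.
    boolean preCheck(String row, int delta) {
        return CONSTRAINT.test(rows.get(row) + delta);
    }

    // Plays the role of the core mutation: applies the delta without re-checking.
    void apply(String row, int delta) {
        rows.merge(row, delta, Integer::sum);
    }

    int get(String row) {
        return rows.get(row);
    }

    // Replays the interleaving: check A, check B, apply A, apply B.
    static int interleaved() {
        UnlockedCheckDemo region = new UnlockedCheckDemo("r1", 6);
        boolean aOk = region.preCheck("r1", 3); // 6 + 3 = 9, passes
        boolean bOk = region.preCheck("r1", 3); // still sees 6, so it passes too
        if (aOk) region.apply("r1", 3);
        if (bOk) region.apply("r1", 3);
        return region.get("r1");
    }

    public static void main(String[] args) {
        int finalValue = interleaved();
        System.out.println("final=" + finalValue
            + " constraintHolds=" + CONSTRAINT.test(finalValue));
    }
}
```

Each request is valid against the value it saw, yet the row ends up at 12, which neither check would have allowed.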

So - I'm questioning whether a pure Coprocessor implementation alone
would be sufficient?

I think we'll need an approach that performs the constraint check and
the mutation _atomically_, either by
a) taking a row lock and passing that into put / checkAndPut, or
b) referencing & checking the constraint directly from within the put
/ checkAndPut methods (like we do with the comparator, for instance)
under a row lock.
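As a rough illustration of option (b), the sketch below evaluates the constraint and applies the mutation while holding a per-row lock. Everything in it is an assumption made for illustration: `LockedConstraintStore`, `checkAndAdd`, and the use of `IntPredicate` as the constraint type are hypothetical names, and it uses plain `java.util.concurrent` locks rather than HBase's row locks or the real put / checkAndPut signatures.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.IntPredicate;

// Hypothetical in-memory store: one integer cell and one lock per row.
class LockedConstraintStore {
    private final ConcurrentHashMap<String, Integer> rows = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    // Checks the constraint and applies the delta while holding the row lock,
    // so no other writer can slip in between the check and the mutation.
    boolean checkAndAdd(String row, int delta, IntPredicate constraint) {
        ReentrantLock lock = locks.computeIfAbsent(row, r -> new ReentrantLock());
        lock.lock();
        try {
            int candidate = rows.getOrDefault(row, 0) + delta;
            if (!constraint.test(candidate)) {
                return false; // rejected: the row is left untouched
            }
            rows.put(row, candidate);
            return true;
        } finally {
            lock.unlock();
        }
    }

    int get(String row) {
        return rows.getOrDefault(row, 0);
    }

    // Fires `threads` concurrent +1 requests at one row.
    // Returns {final value, number of accepted requests}.
    static int[] runConcurrent(int threads, int limit) {
        LockedConstraintStore store = new LockedConstraintStore();
        AtomicInteger accepted = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // release all workers at once
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (store.checkAndAdd("r1", 1, v -> v <= limit)) {
                    accepted.incrementAndGet();
                }
            });
            workers[i].start();
        }
        start.countDown();
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return new int[] { store.get("r1"), accepted.get() };
    }

    public static void main(String[] args) {
        int[] result = runConcurrent(50, 10);
        System.out.println("final=" + result[0] + " accepted=" + result[1]);
    }
}
```

With the check inside the lock, 50 concurrent increments against a limit of 10 leave the row at exactly 10 with exactly 10 requests accepted. The trade-off is that the constraint code now runs inside the critical section, so a slow constraint lengthens the time the row lock is held.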

Without being able to atomically enforce the constraint, I'm wondering
if it is misleading to future users who may create a constraint that
may fail to be enforced under heavy concurrent use.

I know a lot of work has gone into the patches ... but I thought it
better to discuss this before rolling it out to the larger community
... :)

Thanks,
--Suraj
On Fri, Dec 9, 2011 at 1:31 PM, Suraj Varma <[EMAIL PROTECTED]> wrote:
> Hi:
> I opened a jira ticket on this: https://issues.apache.org/jira/browse/HBASE-4999
>
> I have linked to HBASE-4605 in the description to show related work on
> Constraints by Jesse.
>
> Thanks!
> --Suraj
>
> On Sun, Dec 4, 2011 at 1:10 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
>> Currently ConstraintProcessor latches onto prePut() to perform validation
>> check.
>>
>> From HRegion.doMiniBatchPut() where prePut() is called:
>>    /* Run coprocessor pre hook outside of locks to avoid deadlock */
>> So to make use of Constraint in Suraj's scenario, we have some decisions to
>> make about various factors.
>>
>> Cheers
>>
>> On Sun, Dec 4, 2011 at 8:39 AM, Suraj Varma <[EMAIL PROTECTED]> wrote:
>>
>>> Jesse:
>>> >> Quick soln - write a CP to check the single row (blocking the put).
>>>
>>> Yeah - given that I want this to be atomically done, I'm wondering if
>>> this would even work (because, I believe I'd need to unlock the row so
>>> that the checkAndMutate can take the lock - so, there is a brief
>>> window in between where no lock is held - and some other
>>> thread could take that lock). One option would be to pass in a lock to
>>> checkAndMutate ... but that would increase the locking period and may
>>> have performance implications, I think.
>>>
>>> I see a lot of potential in the Constraints implementation - it would
>>> really open up CAS operations to do functional constraint checking,
>>> rather than just value comparisons.
>>>
>>> --Suraj
>>>
>>> On Sun, Dec 4, 2011 at 8:32 AM, Suraj Varma <[EMAIL PROTECTED]> wrote:
>>> > Thanks - I see that the lock is taken internal to checkAndMutate.
>>> >
>>> > I'm wondering whether it is a better idea to actually pass in a
>>> > Constraint (or even Constraints) as the checkAndMutate argument. Right
>>> > now it is taking in a Comparator and a CompareOp for verification.
>>> > But, this could just be a special case of Constraint which is
>>> > evaluated within the lock.
>>> >
>>> > In other words, we could open up a richer Constraint checking api
>>> > where any "functional" Constraint check can be performed in the