HBase, mail # dev - Limited cross row transactions


Re: Limited cross row transactions
Todd Lipcon 2012-01-19, 03:22
I find all of these ideas interesting but a little bit scope-creepy.
It used to be that regions were an implementation detail, but with
these new APIs it'd be very much an application-level construct. We
should think carefully before adding new APIs to do this - perhaps we
can start playing with the idea on a branch and see if there are some
really compelling applications?

-Todd

On Wed, Jan 18, 2012 at 7:03 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
> Was thinking about that as well. That would be doable.
>
> It would still need to be some sort of distributed transaction (in the sense that there would be a prepare/vote and a commit
> phase between the participating regions), but it would all be local to a single server.
>
>
>
> ________________________________
>  From: Ted Yu <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
> Sent: Wednesday, January 18, 2012 6:51 PM
> Subject: Re: Limited cross row transactions
>
> Still need to go over the patch, Lars.
>
> I wonder how difficult supporting cross-region transactions in the same
> region server would be.
>
> Cheers
>
> On Wed, Jan 18, 2012 at 5:02 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>
>> Filed https://issues.apache.org/jira/browse/HBASE-5229 for further
>> discussion, attached a patch that does this.
>>
>>
>> As for your point...
>> The below is one way to define limited groups of rows that can participate
>> in transactions (I should not have named it parent/child, that just
>> confuses my point).
>> Your scenario calls for a global transaction (unless you have some other
>> approach to limit the scope of rows that could participate in your FK
>> transactions to something less than the entire database).
>>
>> If every transaction is a global transaction the database will not scale.
>>
>> See http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
>> and
>> http://www.cloudera.com/blog/2010/04/cap-confusion-problems-with-partition-tolerance/
>>
>> Also check out two-phase commit failure and blocking scenarios, and Paxos'
>> conditions for termination.
>>
>> -- Lars
>>
>>
>> ----- Original Message -----
>> From: Mikael Sitruk <[EMAIL PROTECTED]>
>> To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
>> Cc:
>> Sent: Wednesday, January 18, 2012 12:01 AM
>> Subject: Re: Limited cross row transactions
>>
>> This is for a parent-child relationship, but what if there is no parent-child
>> relationship, but rather a foreign-key-like relationship?
>> Using this model you do a full scan to get all the index rows (since you don't
>> know the parent, you just know the "secondary index").
>> Or will you use a group ID as a prefix of the parent key and the "child" key? In
>> this case splitting according to group may be more difficult (due to
>> different growth rates of the groups).
>> Doing this, aren't we back to the headache of sharding in an RDBMS?
>>
>> Mikael.S
>>
>>
>> On Wed, Jan 18, 2012 at 7:45 AM, lars hofhansl <[EMAIL PROTECTED]>
>> wrote:
>>
>> > This thread is probably getting too long...
>> >
>> > In HBase we have to let go of "global stuff". I submit that global
>> > transactions across thousands of nodes that can fail will never work
>> > adequately.
>> > For that kind of consistency you will take a hit in availability.
>> >
>> > As in Megastore, the trick is in creating a local grouping of entities that
>> > can participate in local transactions.
>> > If you limit the (consistent) index to child entities of parent entity you
>> > can form your index like this:
>> > parentKey1...
>> > parentKey1.childTableName1.indexedField1
>> > parentKey1.childTableName1.indexedField2
>> > ...
>> > parentKey1.childTableName2.indexedField1
>> > parentKey1.childTableName2.indexedField2
>> > ...
>> > (assuming . cannot be in any parent key or child table name here, but you
>> > get the idea).
>> >
>> >
>> > When scanning the parent you'd have to skip the index rows with a filter.
>> > Within a parentKey you can find childKeys efficiently by scanning the
>> > index rows.

Todd Lipcon
Software Engineer, Cloudera
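
Editor's sketch of the index layout quoted at the bottom of the thread. This is not HBase code and not taken from the HBASE-5229 patch; it only models a sorted key space with a Python list to show how the parentKey.childTableName.indexedField keys would group under their parent. The helper names (index_key, is_index_row, scan_parents, scan_index) are hypothetical, and the sketch assumes, as the original message does, that "." never appears in a parent key or child table name.

```python
import bisect

# Assumption from the thread: "." cannot appear in a parent key
# or in a child table name.
SEP = "."


def index_key(parent_key, child_table, indexed_field):
    """Build an index row key that sorts under its parent row."""
    for part in (parent_key, child_table):
        if SEP in part:
            raise ValueError("separator not allowed in %r" % part)
    return SEP.join((parent_key, child_table, indexed_field))


def is_index_row(row_key):
    """Parent rows never contain the separator; index rows always do."""
    return SEP in row_key


def scan_parents(sorted_keys):
    """Scan parent rows only, skipping index rows (the 'filter')."""
    return [k for k in sorted_keys if not is_index_row(k)]


def scan_index(sorted_keys, parent_key, child_table):
    """Prefix scan over the index rows of one child table of one parent."""
    prefix = parent_key + SEP + child_table + SEP
    start = bisect.bisect_left(sorted_keys, prefix)
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so the prefix range has ended
        matches.append(key)
    return matches


# A toy "table": row keys kept in sorted order, as in a region.
rows = sorted([
    "parentKey1",
    index_key("parentKey1", "childTableName1", "indexedField1"),
    index_key("parentKey1", "childTableName1", "indexedField2"),
    index_key("parentKey1", "childTableName2", "indexedField1"),
    "parentKey2",
    index_key("parentKey2", "childTableName1", "indexedField1"),
])

parents = scan_parents(rows)
child1_index = scan_index(rows, "parentKey1", "childTableName1")
```

Because every index key starts with its parent key, a parent and its index rows sit in one contiguous key range, which is what would let them share a region and take part in the same local transaction.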