Re: Limited cross row transactions
I find all of these ideas interesting but a little bit scope-creepy.
It used to be that regions were an implementation detail, but with
these new APIs it'd be very much an application-level construct. We
should think carefully before adding new APIs to do this - perhaps we
can start playing with the idea on a branch and see if there are some
really compelling applications?

-Todd

On Wed, Jan 18, 2012 at 7:03 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
> Was thinking about that as well. That would be doable.
>
> It would still need to be some sort of distributed transaction (in the sense that there would be a prepare/vote and commit
> phase between the participating regions), but it would all be local to a single server.
>
>
>
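A minimal sketch of the prepare/vote and commit phases described above, assuming every participating region lives in the same server process. Region, prepare, commit, abort and transact are hypothetical names chosen for illustration; none of this is the HBase API or the HBASE-5229 patch, and it ignores durability and crash recovery.

```python
class Region:
    """Toy stand-in for a region: a key range plus an in-memory row store."""

    def __init__(self, start, end):
        self.start, self.end = start, end
        self.rows = {}
        self.locked = set()
        self.staged = {}

    def owns(self, key):
        return self.start <= key < self.end

    def prepare(self, writes):
        """Vote phase: lock the rows and stage the writes, or vote no."""
        if any(key in self.locked for key in writes):
            return False
        self.locked.update(writes)
        self.staged = dict(writes)
        return True

    def commit(self):
        """Commit phase: apply the staged writes and release the locks."""
        self.rows.update(self.staged)
        self._release()

    def abort(self):
        """Discard the staged writes and release the locks."""
        self._release()

    def _release(self):
        self.locked.difference_update(self.staged)
        self.staged = {}


def transact(regions, writes):
    """Apply writes atomically across regions hosted by one server."""
    # Group the writes by the region that owns each row key.
    per_region = {}
    for key, value in writes.items():
        region = next(r for r in regions if r.owns(key))
        per_region.setdefault(region, {})[key] = value

    # Phase 1: every participating region must vote yes.
    prepared = []
    for region, region_writes in per_region.items():
        if not region.prepare(region_writes):
            for r in prepared:  # roll back only the regions that voted yes
                r.abort()
            return False
        prepared.append(region)

    # Phase 2: all votes are in, so commit everywhere.
    for region in prepared:
        region.commit()
    return True


r1, r2 = Region("a", "m"), Region("m", "z")
ok = transact([r1, r2], {"apple": 1, "pear": 2})
r2.locked.add("plum")  # simulate a row held by another operation
blocked = transact([r1, r2], {"berry": 3, "plum": 4})
```

In this toy run the first transaction commits in both regions, while the second is aborted in both because one region votes no. Since the coordinator and all participants share one process, the vote never crosses the network, which is what "local to a single server" buys here.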
> ________________________________
>  From: Ted Yu <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
> Sent: Wednesday, January 18, 2012 6:51 PM
> Subject: Re: Limited cross row transactions
>
> Still need to go over the patch, Lars.
>
> I wonder how difficult supporting cross-region transactions in the same
> region server would be.
>
> Cheers
>
> On Wed, Jan 18, 2012 at 5:02 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>
>> Filed https://issues.apache.org/jira/browse/HBASE-5229 for further
>> discussion, attached a patch that does this.
>>
>>
>> As for your point...
>> The below is one way to define limited groups of rows that can participate
>> in transactions (I should not have named it parent/child, that just
>> confuses my point).
>> Your scenario calls for global transactions (unless you have some other
>> approach to limit the scope of rows that could participate in your FK
>> transactions to something less than the entire database).
>>
>> If every transaction is a global transaction the database will not scale.
>>
>> See http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
>> and
>> http://www.cloudera.com/blog/2010/04/cap-confusion-problems-with-partition-tolerance/
>>
>> Also check out two-phase commit failure and blocking scenarios, and Paxos'
>> conditions for termination.
>>
>> -- Lars
>>
>>
>> ----- Original Message -----
>> From: Mikael Sitruk <[EMAIL PROTECTED]>
>> To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
>> Cc:
>> Sent: Wednesday, January 18, 2012 12:01 AM
>> Subject: Re: Limited cross row transactions
>>
>> This is for a parent-child relationship, but what if there is no parent-child
>> relationship, but rather a foreign-key-like relationship?
>> Using this model you would do a full scan to get all the index entries (since you
>> don't know the parent, you just know the "secondary index").
>> Or will you use a group ID as a prefix of the parent key and "child" key? In
>> this case splitting according to group may be more difficult (due to
>> different growth of groups).
>> Doing this, aren't we back to the headache of sharding in an RDBMS?
>>
>> Mikael.S
>>
>>
>> On Wed, Jan 18, 2012 at 7:45 AM, lars hofhansl <[EMAIL PROTECTED]>
>> wrote:
>>
>> > This thread is probably getting too long...
>> >
>> > In HBase we have to let go of "global stuff". I submit that global
>> > transactions across thousands of nodes that can fail will never work
>> > adequately.
>> > For that kind of consistency you will take a hit in availability.
>> >
>> > As in Megastore, the trick is in creating a local grouping of entities that
>> > can participate in local transactions.
>> > If you limit the (consistent) index to child entities of a parent entity you
>> > can form your index like this:
>> > parentKey1...
>> > parentKey1.childTableName1.indexedField1
>> > parentKey1.childTableName1.indexedField2
>> > ...
>> > parentKey1.childTableName2.indexedField1
>> > parentKey1.childTableName2.indexedField2
>> > ...
>> > (assuming "." cannot appear in any parent key or child table name here, but you
>> > get the idea).
>> >
>> >
>> > When scanning the parent you'd have to skip the index rows with a filter.
>> > Within a parentKey you can find childKeys efficiently by scanning the
>> > index rows.
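
The key layout above can be illustrated with a toy model, assuming rows are kept in sorted key order and "." never appears in a parent key or child table name. A plain sorted list stands in for the table, and index_row, prefix_scan and scan_parents are hypothetical helpers, not HBase calls.

```python
from bisect import bisect_left

SEP = "."  # assumed never to appear in a parent key or child table name


def index_row(parent_key, child_table, indexed_value):
    """Row key for an index entry; it sorts right after its parent row."""
    return SEP.join((parent_key, child_table, indexed_value))


def prefix_scan(sorted_keys, prefix):
    """Yield every key starting with prefix, as a range scan would."""
    start = bisect_left(sorted_keys, prefix)
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        yield key


def scan_parents(sorted_keys):
    """Scan all rows but filter out index rows (those containing SEP)."""
    return [key for key in sorted_keys if SEP not in key]


# Toy stand-in for a table: row key -> value, scanned in sorted key order.
table = {
    "parent1": "parent row",
    index_row("parent1", "orders", "2012-01-17"): "order7",
    index_row("parent1", "orders", "2012-01-18"): "order9",
    index_row("parent1", "users", "alice"): "user3",
    "parent2": "parent row",
    index_row("parent2", "orders", "2012-01-18"): "order4",
}
keys = sorted(table)

# Find the children of one parent by scanning only that parent's index rows.
orders_prefix = "parent1" + SEP + "orders" + SEP
orders_of_parent1 = [table[k] for k in prefix_scan(keys, orders_prefix)]

# Scanning parents means skipping the interleaved index rows.
parents = scan_parents(keys)
```

Because each index row shares its parent's key as a prefix, a parent and its index entries sort next to each other, so they can stay in one region and be updated in one local transaction.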

Todd Lipcon
Software Engineer, Cloudera