|
Ramkrishna.S.Vasudevan
2012-08-28, 07:24
Jesse Yates
2012-08-28, 07:59
Ramkrishna.S.Vasudevan
2012-08-28, 08:51
Wei Tan
2012-08-28, 15:52
Ted Yu
2012-08-28, 16:03
Jesse Yates
2012-08-28, 17:03
Ted Yu
2012-08-28, 17:34
Ramkrishna.S.Vasudevan
2012-08-29, 04:18
Jonathan Hsieh
2012-08-29, 13:47
Ted Yu
2012-08-29, 14:15
Ramkrishna.S.Vasudevan
2012-08-29, 15:12
Jonathan Hsieh
2012-08-29, 16:11
Ted Yu
2012-08-29, 16:19
Jesse Yates
2012-08-29, 17:03
Ted Yu
2012-08-29, 17:07
Jonathan Hsieh
2012-08-29, 17:46
Jonathan Hsieh
2012-08-29, 18:18
Stack
2012-08-29, 22:32
Ramkrishna.S.Vasudevan
2012-08-30, 04:18
Ramkrishna.S.Vasudevan
2012-08-30, 04:34
|
-
A general question on maxVersion handling when we have Secondary index tables
Ramkrishna.S.Vasudevan 2012-08-28, 07:24

Hi All

When we build any type of secondary index for a given table, how can one handle maxVersions in the secondary index tables?

For example, I have inserted

Row1 - Val1 => t
Row1 - Val2 => t+1
Row1 - Val3 => t+2

Ideally, if my max versions is only one, then Val3 should be my result if I query the main table for Row1.

Now my index will have all three of the above entries. How can we remove the older entries from the index table that do not fit into maxVersions?

Currently the scan code that enforces maxVersions does not give any hooks to know which entries were skipped because of versions.

So any suggestions on this? I am still looking through the code for other options, but suggestions are welcome.

Regards
Ram
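The mismatch described above can be reproduced with a small in-memory model. This is only a sketch: the dictionaries stand in for the main and index tables, and `put`, `get` and `MAX_VERSIONS` are hypothetical names for illustration, not HBase APIs.

```python
from collections import defaultdict

MAX_VERSIONS = 1  # assumed maxVersions setting on the main table

# Main table model: row -> list of (timestamp, value), newest first.
main = defaultdict(list)
# Index table model: one entry per (value, row) key, storing the write timestamp.
index = {}

def put(row, value, ts):
    """Write to the main table and add a matching index entry."""
    main[row].append((ts, value))
    main[row].sort(reverse=True)
    index[(value, row)] = ts  # older index entries are never touched

def get(row):
    """Return only the newest MAX_VERSIONS values, as a read on the main table would."""
    return [value for _, value in main[row][:MAX_VERSIONS]]

for ts, value in enumerate(["Val1", "Val2", "Val3"]):
    put("Row1", value, ts)

visible = get("Row1")
stale = sorted(key for key in index if key[1] == "Row1" and key[0] not in visible)
print(visible)  # ['Val3']
print(stale)    # [('Val1', 'Row1'), ('Val2', 'Row1')]
```

The main table hides Val1 and Val2 through version handling, but nothing tells the index that those two entries are now dead.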
-
Re: A general question on maxVersion handling when we have Secondary index tables
Jesse Yates 2012-08-28, 07:59

Ram,

If I understand correctly, I think you can design your index so that you don't actually use the timestamp (e.g. everything gets put with a TS = 10, or some other non-special, relatively small number that's not 0, as I'd worry about that in HBase ;). Then when you set maxVersions to 1, everything should be good.

You get a couple of wasted bytes from the TS, but with the prefixTrie stuff that should be pretty minimal overhead. If you do need to keep track of the timestamp, you should be able to munge it into the column qualifier (and just know that the last 64 bits are the timestamp). Again a little more CPU cost, but it's really not that big of an overhead. It seems like you don't really care about the TS though, in which case this should be pretty simple.

Out of curiosity, what are people using for their secondary indexing solutions? I know there are a bunch out there, but I don't know what people have adopted, what they like/dislike, or which design tradeoffs they made and why.

Disclaimer: I recently proposed a secondary indexing solution myself (shameless self-plug: http://jyates.github.com/2012/07/09/consistent-enough-secondary-indexes.html) and it's something I'm working on for Salesforce - open sourced at some point, promise!

-Jesse
-------------------
Jesse Yates
@jesse_yates
jyates.github.com
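A minimal sketch of the "timestamp in the column qualifier" idea, assuming the qualifier is a plain byte string and the timestamp is appended as a big-endian 64-bit integer. The function names and `FIXED_TS` are hypothetical; the thread does not specify an encoding.

```python
import struct

TS_LEN = 8     # 64 bits, as suggested in the message above
FIXED_TS = 10  # the constant cell timestamp every index put would use

def encode_qualifier(qualifier: bytes, ts: int) -> bytes:
    """Append the real timestamp to the qualifier as 8 big-endian bytes."""
    return qualifier + struct.pack(">q", ts)

def decode_qualifier(encoded: bytes):
    """Split an encoded qualifier back into (qualifier, timestamp)."""
    qualifier, suffix = encoded[:-TS_LEN], encoded[-TS_LEN:]
    (ts,) = struct.unpack(">q", suffix)
    return qualifier, ts

encoded = encode_qualifier(b"row1", 1346138640000)
print(len(encoded))               # 12
print(decode_qualifier(encoded))  # (b'row1', 1346138640000)
```

Big-endian packing keeps non-negative timestamps in numeric order under byte-wise comparison, which matters if the index is scanned by qualifier.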
-
RE: A general question on maxVersion handling when we have Secondary index tables
Ramkrishna.S.Vasudevan 2012-08-28, 08:51

Hi Jesse

Thanks a lot for your reply.

-> Not maintaining timestamps in the secondary index may cause problems when I issue a delete on the main table and the corresponding entries need to be deleted from the secondary index.

-> In the case I mentioned below, the index table will have

Val1_row1 (t)
Val2_row1 (t+1)
Val3_row1 (t+2)

Now my query says "get me all the values greater than Val1"; ideally only Val3 should be fetched. But a direct scan on the index table will not know that it should give me Val3_row1 alone. Unless I know the number of existing entries, I cannot decide which one should be skipped and which one considered. On the main table, in any case, only Val3 will be retrieved for row1.

-> If I have a use case where I want to remove the older versions during compaction of the index table, how can we do it? Keeping all the older versions may also increase the number of files, which then have to be compacted. But if I want to remove such older versions during compaction, what are the ways to handle it?

These are some problems that come to my mind when we want to implement this. Jesse, am I missing something here?

The prefixTrie stuff matters when we are concerned about storage; yes, using prefixTrie will help there.

As for the usage of secondary indexes, I cannot comment on that now.

Regards
Ram
-
Re: A general question on maxVersion handling when we have Secondary index tables
Wei Tan 2012-08-28, 15:52

Thanks for sharing a pointer to your implementation.

My two cents: the timestamp is a way to do MVCC, and writing every KV with the same TS will make concurrency control very tricky and error prone, if not impossible.

I think Ram is talking about dead entries in the index table rather than in the data table. Deleting old index entries upfront when there is a new put might be a choice.

Best Regards,
Wei

Wei Tan
Research Staff Member
IBM T. J. Watson Research Center
19 Skyline Dr, Hawthorne, NY 10532
[EMAIL PROTECTED]; 914-784-6752
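The "delete old index entries upfront" option could look roughly like the sketch below. It assumes a single writer and models both tables as dictionaries; `put_with_index` is a hypothetical helper. In a real deployment the read-then-write would need to be made atomic, which this sketch does not attempt.

```python
main = {}   # row -> current value (models a main table with maxVersions = 1)
index = {}  # (value, row) -> timestamp of the indexed write

def put_with_index(row, value, ts):
    """Remove the index entry for the value being replaced, then write both tables."""
    old = main.get(row)
    if old is not None and old != value:
        index.pop((old, row), None)  # upfront delete of the dead index entry
    main[row] = value
    index[(value, row)] = ts

for ts, value in enumerate(["Val1", "Val2", "Val3"]):
    put_with_index("Row1", value, ts)

print(index)  # {('Val3', 'Row1'): 2}
```

The cost is an extra read of the main table on every put, which is the tradeoff against cleaning up lazily at read time.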
-
Re: A general question on maxVersion handling when we have Secondary index tables
Ted Yu 2012-08-28, 16:03

I think this discussion should be on HBASE JIRA.

Another dimension to secondary indexing is the co-location (or pairing) of data table regions and index table regions. Related regions from the two tables should be placed on the same region server.

Cheers
-
Re: A general question on maxVersion handling when we have Secondary index tables
Jesse Yates 2012-08-28, 17:03

@Ted: Are you proposing re-opening the "should we have secondary indexes in HBase" discussion? If so, I'm +1 on adding them. Wanna file a jira?

@Wei Tan: Yeah, I generally agree. However, I think you can get away with ignoring MVCC and just keep an index on the latest key (where key _includes_ the timestamp) and then do lazy cleanup.

@Ram: if you move the TS into the CQ you can drop the actual TS (so it costs you some minor computational overhead to pull it out), still giving you the right answer without actually using HBase timestamps.

I've proposed that you can just do an async cleanup of the index when you find out it's stale, with minimal overhead to the clients. Otherwise, yes, you would need a way to tie together the versions in the index and primary tables, which you don't always want to keep exactly the same.

Also, there is an issue when returning the version of the row based on the indexed TS. Should you return the whole row? Should you return just the parts of the row with timestamps the same age or older? For the latter, how do you know which parts of the row to return when you have two versions of the same column that was indexed (which other row elements should be included based on TS)? I'd propose these are all questions that need to be answered if we are going to do a general HBase index.

-------------------
Jesse Yates
@jesse_yates
jyates.github.com
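A rough sketch of the lazy cleanup idea applied to Ram's "values greater than Val1" query: each index hit is checked against the main table, and entries that no longer match are dropped. The dictionaries and `query_greater_than` are hypothetical, and the cleanup runs inline here, whereas the proposal above is to do it asynchronously.

```python
def query_greater_than(low, main, index):
    """Scan the index for values > low, keep live hits, and delete stale ones."""
    hits, stale = [], []
    for value, row in sorted(index):
        if value <= low:
            continue
        if main.get(row) == value:
            hits.append((value, row))   # index entry agrees with the main table
        else:
            stale.append((value, row))  # main table has moved on
    for key in stale:
        index.pop(key, None)            # would be queued for async cleanup in practice
    return hits

main = {"row1": "Val3"}
index = {("Val1", "row1"): 0, ("Val2", "row1"): 1, ("Val3", "row1"): 2}

hits = query_greater_than("Val1", main, index)
print(hits)           # [('Val3', 'row1')]
print(sorted(index))  # [('Val1', 'row1'), ('Val3', 'row1')]
```

Val1 survives because this query never scanned it; stale entries are removed only when a read happens to touch them.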
-
Re: A general question on maxVersion handling when we have Secondary index tables
Ted Yu 2012-08-28, 17:34

I think we should revive the secondary indexes discussion (actually, it has already been revived).

Since Ramkrishna has a design in mind, he would be the best person to log a new JIRA.

Cheers
-
RE: A general question on maxVersion handling when we have Secondary index tables
Ramkrishna.S.Vasudevan 2012-08-29, 04:18

Hi

Yes, I was talking about the dead entries in the index table rather than in the actual data table.

Regards
Ram
-
Re: A general question on maxVersion handling when we have Secondary index tables
Jonathan Hsieh 2012-08-29, 13:47

I'm more of a fan of having secondary indexes added as an external feature (coproc or new client library on top of our current client library) and focusing on only adding the APIs necessary to make 2ndary indexes possible and correct on/in HBase. There are many different use patterns and requirements, and one style of secondary index will not be good for everything. Do we only care about this working well for highly selective keys? What are the possible indexes (col name, value, value prefix, everything our filters support)? Do we care more about writes or reads, ACID correctness or speed, etc.? Also, there are several questions about how we handle other features in conjunction with 2ndary indexes: replication, bulk load, snapshots, to name a few.

Maybe it makes sense to spend some time defining what we want to index secondarily and what a user API to this external feature would be. Then we could have different implementations under the covers, and allow users to swap implementations for the tradeoffs that fit their use cases. It wouldn't be free to change but hopefully "easy" from a user point of view.

Personally, I tend to favor more of a percolator-style implementation -- it is a client library built on top of HBase. This approach seems to be more "HBase-style" with its emphasis on consistency and atomicity, and seems to require only a few modifications to HBase core. Sure, it is likely slower than my read of Jesse's proposal, but it seems to be always consistent and thus predictable in cases where there are failures on deletes and updates. We'd need HBase API primitives like a checkAndMutate call (check with multiple deletes/puts on the same row), and possibly an atomic multi-table bulk load. I'm not sure that it is replication compatible, and there are probably questions we'll need to answer once snapshots solidify.

Ted's idea of colocating regions (like the index table's regions) definitely feels like a primitive (pluggable, likely per-table region assignment plans) that we could add to HBase core. This requirement for 2ndary indexes, though, seems to imply an approach similar to Cassandra's -- having a local index for each region on the region server and colocating them. Is this right? If so, this is essentially a filtering optimization -- it would mean that a query based on the secondary index would potentially have to hit every region server that has a region in the primary table. This is a great approach if the index lookup has high cardinality, but if the secondary index is highly selective, you'd have to march through a bunch of RSs before getting an answer.

Jon.

// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// [EMAIL PROTECTED]
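The fan-out concern with region-local indexes can be illustrated with a toy model. Everything here is an assumption made for the sketch: the split keys, the one-unique-value-per-row data, and a global index that fits in a single index region. It only counts how many regions a lookup would have to contact.

```python
import bisect
from collections import defaultdict

SPLITS = ["row25", "row50", "row75"]  # hypothetical split keys -> 4 data regions
NUM_REGIONS = len(SPLITS) + 1

def region_of(row):
    """Map a row key to the data region whose key range contains it."""
    return bisect.bisect_right(SPLITS, row)

# Each row has a unique value, i.e. the indexed column is highly selective.
rows = {f"row{i:02d}": f"val{i:02d}" for i in range(100)}

local_indexes = [defaultdict(list) for _ in range(NUM_REGIONS)]  # one index per region
global_index = defaultdict(list)                                 # one separate index table
for row, value in rows.items():
    local_indexes[region_of(row)][value].append(row)
    global_index[value].append(row)

def lookup_local(value):
    """Local indexes: every region must be asked, since any of them may hold a match."""
    matches, contacted = [], 0
    for region_index in local_indexes:
        contacted += 1
        matches.extend(region_index.get(value, []))
    return matches, contacted

def lookup_global(value):
    """Global index: one index region, then only the data regions holding matches."""
    matches = list(global_index.get(value, []))
    data_regions = {region_of(row) for row in matches}
    return matches, 1 + len(data_regions)

print(lookup_local("val42"))   # (['row42'], 4)
print(lookup_global("val42"))  # (['row42'], 2)
```

With four regions the difference is small, but the local-index cost grows with the number of regions in the primary table while the global lookup depends only on where the matches live.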
-
Re: A general question on maxVersion handling when we have Secondary index tablesTed Yu 2012-08-29, 14:15
Thanks for the detailed response, Jon.
bq. it would mean that a query based on secondary index would potentially have to hit every region server that has a region in the primary table.

Can you elaborate on the above a little bit? Is this because the secondary index would point us to more than one region in the data table, since several versions are saved for the same row?

My thinking was to ease management of simultaneous (data and index) region splits through region colocation.

Cheers
-
RE: A general question on maxVersion handling when we have Secondary index tablesRamkrishna.S.Vasudevan 2012-08-29, 15:12
When there is a many-to-one mapping between the main table and the secondary index table, we may end up hitting many RSs. If the mapping is one-to-one, that may not be a problem.

Basically, my intention with this discussion was to talk about version maintenance for any type of secondary index, particularly how to remove stale entries in the index table that would have expired.

Regards
Ram
-
Re: A general question on maxVersion handling when we have Secondary index tablesJonathan Hsieh 2012-08-29, 16:11
Ted,
Ram summarizes the concern succinctly. To answer the specific question: it isn't because of versions, it is for the case where a secondary index entry can point to many, many primary rows. (Let's say we have a rowkey of userid and we want a 2ndary index based on the state portion of their address; we'll end up pointing to many, many primary rows.)

Jon.

// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// [EMAIL PROTECTED]
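To make the fan-out concern concrete, here is a rough sketch only. It models a primary table keyed by userid and split into regions by key range, plus an index keyed on state. The region boundaries, the user ids, and the helper names (`region_of`, `regions_touched`) are invented for illustration; none of this is HBase API.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical sorted region start keys for the primary table.
REGION_STARTS = ["", "user250", "user500", "user750"]


def region_of(rowkey):
    """Return the index of the region whose key range contains rowkey."""
    return bisect_right(REGION_STARTS, rowkey) - 1


# Toy primary table: userid -> state, 1000 users spread over 4 states.
STATES = ["CA", "NY", "TX", "WA"]
primary = {"user%03d" % i: STATES[i % 4] for i in range(1000)}

# Toy secondary index: state -> list of primary row keys.
index = defaultdict(list)
for userid, state in primary.items():
    index[state].append(userid)


def regions_touched(state):
    """Regions that must be visited to fetch every primary row for a state."""
    return sorted({region_of(rowkey) for rowkey in index[state]})


print(len(index["CA"]), regions_touched("CA"))
```

In this toy data a single index value points at a quarter of the primary rows, and those rows sit in every region, so following the index means visiting every region of the primary table.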
-
Re: A general question on maxVersion handling when we have Secondary index tablesTed Yu 2012-08-29, 16:19
For the example of a secondary index based on the state portion of the address, I wonder if we can achieve comparable performance using a scan with a proper filter.

Cheers
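One way to frame that comparison, as a toy sketch only: count how many rows each approach has to examine. The table contents and the function names below are made up for illustration, and the model ignores everything that matters in a real cluster (network hops, scans running in parallel across region servers, block cache), so it says nothing about actual HBase performance.

```python
# Toy primary table: userid -> state (hypothetical data).
STATES = ["CA", "NY", "TX", "WA"]
primary = {"user%03d" % i: STATES[i % 4] for i in range(1000)}


def filtered_scan(table, wanted_state):
    """Full scan with a value filter: every row is examined."""
    examined = 0
    hits = []
    for rowkey in sorted(table):
        examined += 1
        if table[rowkey] == wanted_state:
            hits.append(rowkey)
    return hits, examined


def build_index(table):
    """Build a state -> [rowkey] index from the primary table."""
    index = {}
    for rowkey, state in table.items():
        index.setdefault(state, []).append(rowkey)
    return index


def index_lookup(index, wanted_state):
    """Index read: only the matching row keys are examined."""
    rowkeys = sorted(index.get(wanted_state, []))
    return rowkeys, len(rowkeys)


scan_hits, scan_examined = filtered_scan(primary, "CA")
idx_hits, idx_examined = index_lookup(build_index(primary), "CA")
print(scan_examined, idx_examined)
```

Both paths return the same rows; the difference is only in how many rows are looked at, and that gap shrinks as the indexed value becomes less selective.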
-
Re: A general question on maxVersion handling when we have Secondary index tablesJesse Yates 2012-08-29, 17:03
Client library style stuff is _nice_, but one of the things everyone asks of a database is that we provide an index (cassandra has it, riak has it, mysql has it...hbase doesn't? Yes, different systems, etc., etc., but the point is we could do it). Further, if we build it as a part of hbase, we can make it faster... though don't ask me the _how_ on that yet ;)

Talking with Lars, we could provide a lot of the indexing infrastructure but leave the actual indexing (converting row|cf|cq|ts|value to an index value and vice-versa) to a client library, which gives us a lot of the flexibility that people would need. And I take it that most people already have some form of indexing (be it consistent or not), so we can do it 'the right way' in terms of queries, etc., and provide pluggable infrastructure (with a decent default) so people can roll in their own implementations.

That said, I think we can do secondary indexing without too many changes to HBase (the region co-location/pinning that Ted suggests would just be sweet overall), which argues for a client library. However, if we decide this is one of the things we want to support going forward as a project, then it makes more sense to do it as part of HBase, rather than pointing people to some guy/gal's website with the information (which may or may not be up to date) on how to munge indexing in. Instead, it would be so much nicer to just flip a couple of switches, maybe plug in a couple of classes, and have indexing _just work_.

Just my $0.02

-Jesse
-------------------
Jesse Yates
@jesse_yates
jyates.github.com
-
Re: A general question on maxVersion handling when we have Secondary index tablesTed Yu 2012-08-29, 17:07
I agree with Jesse.
For the initial implementation, we can pick a common use case. When users present more use cases, we can add more support in HBase core.
-
Re: A general question on maxVersion handling when we have Secondary index tablesJonathan Hsieh 2012-08-29, 17:46
Let me rephrase to make sure I'm on the same page for Ram's question:

We do three inserts on row 1 at different times to the same column (which is being indexed in a secondary table). (Are we assuming only a 1-to-1 secondary->primary mapping?)

t1 < t2 < t3
put ("row1", "cf:c", "val1", t1)
put ("row1", "cf:c", "val2", t2)
put ("row1", "cf:c", "val3", t3)

What happens is that in the primary table we have:

row1 / cf:c = val1 @ t1
row1 / cf:c = val2 @ t2
row1 / cf:c = val3 @ t3

I'm assuming that these writes happen to a secondary table like this:

put ("val1", "r", "row1", t1)
put ("val2", "r", "row1", t2)
put ("val3", "r", "row1", t3)

and in the secondary table we have:

val1 / r = row1 @ t1
val2 / r = row1 @ t2
val3 / r = row1 @ t3

The core question is how and when we can efficiently and correctly get rid of the now invalid val1, val2 rows in the index table.

Let's look at some of the strawmen:
1) A periodic scan of the secondary table that adds delete markers for invalid entries (removed on major compaction).
2) Lazily add a delete marker on reads that turn out to be invalid (we are @t4, attempt to read via "val2" in the 2ndary index, see that the primary value is invalid, so do a checkAndDelete of val2 from the 2ndary). These would get removed on major compaction.
3) Delete on update. This means we need to know whether we are modifying a value, and thus incurs at least one extra read per write.

Ram, does this seem like the right question and potential options to consider?

Jon.

// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// [EMAIL PROTECTED]
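Strawmen 2 and 3 can be sketched with plain dictionaries standing in for the two tables. This is only an in-memory model under several assumptions: the primary keeps one version per row (maxVersions = 1), the index is keyed by a (value, rowkey) pair rather than the val / r layout above, writes arrive in timestamp order, and the class and method names are hypothetical. The `del` in `read_via_index` merely stands in for a checkAndDelete against a real index table.

```python
class Tables:
    """Toy model of a primary table and its secondary index."""

    def __init__(self):
        self.primary = {}  # rowkey -> (value, ts), newest version only
        self.index = {}    # (value, rowkey) -> ts

    def put(self, rowkey, value, ts, delete_on_update=False):
        if delete_on_update:
            # Strawman 3: the extra read per write, used to find the old value.
            old = self.primary.get(rowkey)
            if old is not None and old[0] != value:
                self.index.pop((old[0], rowkey), None)
        self.primary[rowkey] = (value, ts)
        self.index[(value, rowkey)] = ts

    def read_via_index(self, value):
        """Strawman 2: verify each index hit and lazily drop stale ones."""
        live = []
        for key in [k for k in self.index if k[0] == value]:
            rowkey = key[1]
            current = self.primary.get(rowkey)
            if current is not None and current[0] == value:
                live.append(rowkey)
            else:
                del self.index[key]  # stand-in for checkAndDelete
        return live


lazy = Tables()
eager = Tables()
for ts, val in enumerate(["val1", "val2", "val3"], start=1):
    lazy.put("row1", val, ts)
    eager.put("row1", val, ts, delete_on_update=True)

print(len(lazy.index), len(eager.index))
```

With lazy cleanup the stale val1 and val2 entries stay in the index until someone reads through them; with delete on update the index never holds more than the current entry, at the cost of a read before every write.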
-
Re: A general question on maxVersion handling when we have Secondary index tablesJonathan Hsieh 2012-08-29, 18:18
We should have an HBase dev meeting and bring this up as one of the topics of discussion. I'll start another thread on that.

On Wed, Aug 29, 2012 at 10:03 AM, Jesse Yates <[EMAIL PROTECTED]> wrote:

> Client library style stuff is _nice_ but one of the things everyone asks of
> database is that we provide an index (cassandra has it, riak has it, mysql
> has it...hbase doesn't? Yes, different systems,etc.,etc., but the point is
> we could do it). Further, if we build it as a part of hbase, we can make it
> faster... though don't ask me the _how_ on that yet ;)

My main concern is that there are many possible ways to build an implementation that is good for one use case / workload but absolutely terrible for others.

> Talking with Lars, we could provide a lot of the indexing infrastructure,
> but leave the actual indexing (convert row|cf|cq|ts|value to an index value
> and vice-versa) to a client library gives us a lot of the flexibility that
> people would need. And I take it that most people already have some form of
> indexing already (be it consistent or not), so we can do it 'the right way'
> in terms of queries, etc. and provide pluggable infrastructure (with a
> decent default) so people can roll in their own implementations.
>
> That said, I think we can do secondary indexing without too many changes to
> HBase (region co-location/pinning that Ted suggests would just be sweet
> overall)arguing for a client library. However, if we decide this is one of
> the things we want to support going forward as a project, then it makes
> more sense to do it as part of HBase, rather than pointing people to some
> guy/gal's website with the information (which may or may not be up to date)
> for how munge indexing in. Instead, it would be so much nicer to just flip
> a couple switches, maybe plug in a couple of classes and have indexing
> _just work_.

Isn't that the rationale for coprocessors? (Just add something to a config and start hbase?) Also, with secondary indices, we'll potentially be adding new user-exposed APIs. I think these should be definable in a way that can work across many algorithms. We should figure out what they are, so that when there are different implementations, users can pick and choose the ones that are good for them.

// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// [EMAIL PROTECTED]
-
Re: A general question on maxVersion handling when we have Secondary index tablesStack 2012-08-29, 22:32
On Tue, Aug 28, 2012 at 9:03 AM, Ted Yu <[EMAIL PROTECTED]> wrote:
> I think this discussion should be on HBASE JIRA.

I disagree (I've already quoted you the 'discussion on mailing list, not in JIRA' section from the Karl Fogel book).

St.Ack
-
RE: A general question on maxVersion handling when we have Secondary index tablesRamkrishna.S.Vasudevan 2012-08-30, 04:18
Yes Jon. You got it right. This is the problem. But all the
implementation we need to have some type of mechanism where we go thro all the rows in the sec index table. Suppose in the below example if I say in my main table maxVersions is 5. So I will scan the top 5 values from the sec index table and once I get the 6th value I need to delete the first one from the sec index table. This involves some type of cache or map where I can keep incrementing the count for every row that we get. And whenever I see I have value which is more than maxVersions delete the oldest one. We also thought of another option though it is slower ->Scan one row in Sec table. ->Extract the actual row key of the main table and scan the main table using that. Here I will be getting only the required version entries. -> Now based on these entries delete the expired entries from the sec index table. Thought of doing this in Compaction time.(Major). But doing this has one problem like when ever we do compaction we deal with direct store level scanners. Even if we try to use the new hooks added by Lars H preCompactScannerOpen(), This scanner always expects the kvs to be ordered. But we may not be able to get them in order if we try the way mentioned here. We also felt that if we have a hook while filtering out the expired KVs may be we can try using this? But need to check how much it is efficient. So the suggestion given by Jon is one of the option but it involves more caching and we may need to go for a persistant caching also if the size goes increasing. Thanks to all for providing your suggestions. 
Regards
Ram

> -----Original Message-----
> From: Jonathan Hsieh [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, August 29, 2012 11:16 PM
> To: [EMAIL PROTECTED]
> Subject: Re: A general question on maxVersion handling when we have
> Secondary index tables
>
> Let me rephrase to make sure I'm on the same page for Ram's question:
>
> We do three inserts on row 1 at different times to the same column (which
> is being indexed in a secondary table). (Are we assuming only a 1-to-1
> secondary->primary mapping?)
>
> t1 < t2 < t3
> put ("row1", "cf:c", "val1", t1)
> put ("row1", "cf:c", "val2", t2)
> put ("row1", "cf:c", "val3", t3)
>
> What happens is in the primary table we have:
>
> row1 / cf:c = val1 @ t1
> row1 / cf:c = val2 @ t2
> row1 / cf:c = val3 @ t3
>
> I'm assuming that these writes happen to a secondary table like this:
> put ("val1", "r", "row1", t1)
> put ("val2", "r", "row1", t2)
> put ("val3", "r", "row1", t3)
>
> and in the secondary table we have:
>
> val1 / r = row1 @ t1
> val2 / r = row1 @ t2
> val3 / r = row1 @ t3
>
> The core question is how and when can we efficiently and correctly get rid
> of the now invalid val1, val2 rows in the index table.
>
> Let's look at some of the strawmen:
> 1) periodic scan of secondary table that adds delete markers for invalid
> entries (removed on major compact)
> 2) lazily add delete markers on reads that are invalid (we are @t4, attempt
> to read via "val2" in 2ndary index, see primary value is invalid so do a
> checkAndDelete val2 from 2ndary). would get removed on major compact.
> 3) delete on update. This means we need to know if we are modifying a
> value and thus incurs at least an extra read per write.
>
> Ram, does this seem like the right question and potential options to
> consider?
>
> Jon.
>
> On Wed, Aug 29, 2012 at 8:12 AM, Ramkrishna.S.Vasudevan <
> [EMAIL PROTECTED]> wrote:
>
> > When we have many to one mapping between main and secondary index table
> > may be we will end up in hitting many RS. If there is one to one mapping
> > may be that is not a problem.
> >
> > Basically my intention of this discussion was mainly to discuss on the
> > version maintenance on any type of secondary index particularly to remove
> > the stale data in the index table that would have expired.
> >
> > Regards
> > Ram
> >
> > > -----Original Message-----
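Strawman 2 from the quoted message (lazy cleanup on read) can be illustrated with a small in-memory model. This is a sketch under assumptions rather than HBase code: `read_via_index` and the dictionary layouts are hypothetical, and the plain set removal stands in for the checkAndDelete mentioned in the quote, so concurrent writers are ignored here.

```python
def read_via_index(index, primary, value, max_versions=1):
    """Look up main rows by indexed value, dropping stale index entries.

    index:   dict mapping value -> set of (main_row, timestamp)
    primary: dict mapping main_row -> list of (timestamp, value)
    Both layouts are hypothetical stand-ins for the two tables.
    """
    live_rows = []
    # sorted() copies the set, so removing entries inside the loop is safe.
    for main_row, ts in sorted(index.get(value, set())):
        # Only the newest max_versions cells are still visible in the
        # main table.
        visible = sorted(primary.get(main_row, []), reverse=True)[:max_versions]
        if (ts, value) in visible:
            live_rows.append(main_row)
        else:
            # Stale pointer: remove it lazily, as part of this read.
            index[value].discard((main_row, ts))
    return live_rows


if __name__ == "__main__":
    primary = {"row1": [(1, "val1"), (2, "val2"), (3, "val3")]}
    index = {
        "val1": {("row1", 1)},
        "val2": {("row1", 2)},
        "val3": {("row1", 3)},
    }
    print(read_via_index(index, primary, "val2"))
    print(read_via_index(index, primary, "val3"))
```

The trade-off is visible in the model: each index hit costs one extra lookup in the main table, and index entries that are never read are never cleaned up, so this option would still need something like the periodic scan of strawman 1 as a backstop.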
-
RE: A general question on maxVersion handling when we have Secondary index tablesRamkrishna.S.Vasudevan 2012-08-30, 04:34
Regarding colocation of the main table regions and the index table regions: that is pretty much necessary.

Regarding whether the secondary index feature should be supported as an external feature or in core: judging from what we have done so far, I would say it can be handled like security, meaning the secondary index can be supplied along with the core. If we base our implementation on coprocessors, the overall changes to the kernel seem to be minimal. And if we are OK with having the secondary index feature along with the core, then those changes become inevitable and, at the same time, useful too.

Regards
Ram

> -----Original Message-----
> From: Ted Yu [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, August 29, 2012 9:50 PM
> To: [EMAIL PROTECTED]
> Subject: Re: A general question on maxVersion handling when we have
> Secondary index tables
>
> For the secondary index based on state portion of address example, I wonder
> if we can achieve comparable performance using scan with proper filter.
>
> Cheers
>
> On Wed, Aug 29, 2012 at 9:11 AM, Jonathan Hsieh <[EMAIL PROTECTED]> wrote:
>
> > Ted,
> >
> > Ram summarizes the concern succinctly -- to answer the specific question
> > it isn't for versions -- it is for the case where a secondary index can
> > point to many many primary rows. (let's say we have a rowkey userid and we
> > want to have a 2ndary index based on the state portion of their address
> > --- we'll end up pointing to many many primary rows).
> >
> > Jon.
> >
> > On Wed, Aug 29, 2012 at 7:15 AM, Ted Yu <[EMAIL PROTECTED]> wrote:
> >
> > > Thanks for the detailed response, Jon.
> > >
> > > bq. it would mean that a query based on secondary index would
> > > potentially have to hit every region server that has a region in the
> > > primary table.
> > >
> > > Can you elaborate on the above a little bit ?
> > > Is this because secondary index would point us to more than one region in
> > > the data table because several versions are saved for the same row ?
> > >
> > > My thinking was to ease management of simultaneous (data and index)
> > > region split through region colocation.
> > >
> > > Cheers
> > >
> > > On Wed, Aug 29, 2012 at 6:47 AM, Jonathan Hsieh <[EMAIL PROTECTED]>
> > > wrote:
> > >
> > > > I'm more of a fan of having secondary indexes added as an external
> > > > feature (coproc or new client library on top of our current client
> > > > library) and focusing on only adding apis necessary to make 2ndary
> > > > indexes possible and correct on/in HBase. There are many different use
> > > > patterns and requirements and one style of secondary index will not be
> > > > good for everything. Do we only care about this working well for highly
> > > > selective keys? What are possible indexes (col name, value, value
> > > > prefix, everything our filters support?) Do we care more about writes
> > > > or reads, ACID correctness or speed, etc? Also, there are several
> > > > questions about how we handle other features in conjunction with 2ndary
> > > > indexes: replication, bulk load, snapshots, to name a few.
> > > >
> > > > Maybe it makes sense to spend some time defining what we want to index
> > > > secondarily and what a user api to this external api would be. Then we
> > > > could have the different implementations under-the-covers, and allow
> > > > for users to swap implementations for the tradeoffs that fit their use
> > > > cases. It wouldn't be free to change but hopefully "easy" from a user
> > > > point of view.
> > > >
> > > > Personally, I tend to favor more of a percolator-style implementation
> > > > -- it is a client library and built on top of hbase. This approach
> > > > seems to be more "HBase-style" with its emphasis on consistency and
> > > > atomicity, and seems to require only a few modifications to HBase core.
> > > > Sure it likely slower than my read of Jesse's proposal, but it seems always always |