-
Secondary indexes suggestions
Lukáš Drbal, 2012-08-12, 12:45

Hi all,

I am a new HBase user and I need help with secondary indexes.

For example, I have messages and users. Each user has many messages. The data structure will look like this:

Message:
- String id
- Long sender_id
- Long recipient_id
- String text
- Timestamp created_at
[...]

User:
- Long id
- String username
[...]

I need to create secondary indexes for reading all messages:
a) inbox (by recipient_id) in a time range
b) outbox (by sender_id) in a time range

Can someone give me suggestions for these indexes and for the column family attributes? I expect 500M messages and 50M users here.

Thanks a lot for any response.

P.S. Sorry for my bad English, it isn't my primary language.

Lukas Drbal
-
Re: Secondary indexes suggestions
Otis Gospodnetic, 2012-08-13, 21:49

Lukáš, have a look at this recent post on this topic:

http://blog.sematext.com/2012/08/09/consider-using-fuzzyrowfilter-when-in-need-for-secondary-indexes-in-hbase/

Otis
----
Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm
-
Re: Secondary indexes suggestions
Michael Segel, 2012-08-14, 00:28

Not really a good idea or anything new.

It is essentially a full table scan where you take a closer look at each key to see if it matches your search regex before actually fetching the entire row and returning it.

Secondary indexes are pretty straightforward. You have your primary key and then your value. A secondary index is a table where the key is one of the values from the main base table, and the value is the key from the base table.

So if your main key is 12345, and you store {'Fred', 'Cleveland', 'Ohio'} == {Name, City, State}, you could create an index on State where you store 'Ohio' as the key and 12345 as a column value.

Then if you look up the row with the key 'Ohio' in the second table, you get the keys of all base table rows whose State is 'Ohio'. In this example, that is the row with the key '12345'...

HTH
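
The index-table layout described above can be sketched with two in-memory maps standing in for the base table and the index table. This is only an illustration in Python, not HBase client code: the names `base_table`, `state_index`, `put_person` and `find_by_state` are hypothetical, the extra rows are invented sample data, and keeping a set of base keys per index row (so that one State can point at many rows) is an assumption of the sketch.

```python
# Toy model of a base table plus a secondary index table.
# Plain dicts stand in for HBase tables; all names here are hypothetical.

base_table = {}    # row key -> {column: value}
state_index = {}   # index row key (State) -> set of base-table row keys


def put_person(row_key, name, city, state):
    """Write the base row, then the matching index entry."""
    base_table[row_key] = {"Name": name, "City": city, "State": state}
    state_index.setdefault(state, set()).add(row_key)


def find_by_state(state):
    """Two-step lookup: read the index row, then fetch each base row."""
    keys = sorted(state_index.get(state, set()))
    return [(key, base_table[key]) for key in keys]


put_person("12345", "Fred", "Cleveland", "Ohio")
put_person("67890", "Wilma", "Columbus", "Ohio")   # invented sample row
put_person("24680", "Barney", "Austin", "Texas")   # invented sample row

for key, row in find_by_state("Ohio"):
    print(key, row["Name"], row["City"])
```

Note that the sketch performs two separate writes per record, which is exactly the part that becomes hard once clients can fail or race with each other.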
-
Re: Secondary indexes suggestions
lars hofhansl, 2012-08-14, 00:42

Secondary indexes are only simple when you ignore concurrent updates and failing clients.

A client could manage to write the index first and then fail on the main row. (That can be handled by always rechecking the main row and always scanning all versions of the index rows, which is hard/expensive in a scan.) You can also have a WAL, which you check upon each read, reapplying all outstanding changes. (Secondary index updates are nice in that they are idempotent.)

There are other scenarios that make this similarly hard, and that is the reason why HBase doesn't have secondary indexes.

We've been thinking about primitives to add to HBase to make building and using secondary indexes easier/feasible.

Should indexes be global (i.e. it is up to a client or coprocessor to gather the matches and requery the actual rows)? Or local (which means a query needs to farm out many queries in parallel to all index sites)? Both have pros and cons.

I think the key point of the fuzzy filter is that it can actually seek ahead (using the HBase Filter seek hints), which has the potential to be far more efficient than a full scan. In fact, local indexes would probably be implemented that way: you always scan the main table and use the index information to seek ahead.

Just my $0.02, though. :)

-- Lars
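
The "write the index first, recheck the main row on read" idea from the message above can be sketched as follows. This is a toy model in Python with dicts standing in for tables, not HBase code; the names `write_message`, `put_index` and `read_inbox` and the `fail_before_main_row` switch are hypothetical, and the sketch ignores versions and concurrent writers.

```python
# Toy model: write the index entry first, then the main row, and make
# readers verify every index hit against the main row. Dicts stand in for
# tables; all names are hypothetical.

main_table = {}   # message_id -> {"recipient": ..., "text": ...}
index_table = {}  # recipient -> set of message_ids


def put_index(recipient, message_id):
    # Idempotent: applying the same index update twice changes nothing.
    index_table.setdefault(recipient, set()).add(message_id)


def write_message(message_id, recipient, text, fail_before_main_row=False):
    put_index(recipient, message_id)
    if fail_before_main_row:
        # Simulates a client that dies between the two writes.
        return
    main_table[message_id] = {"recipient": recipient, "text": text}


def read_inbox(recipient):
    """Return only index hits that the main row confirms."""
    hits = []
    for message_id in sorted(index_table.get(recipient, set())):
        row = main_table.get(message_id)
        if row is not None and row["recipient"] == recipient:
            hits.append(message_id)
    return hits


write_message("m1", "alice", "hello")
write_message("m2", "alice", "lost", fail_before_main_row=True)
put_index("alice", "m1")  # replaying an index update is harmless

print(read_inbox("alice"))
```

The dangling entry for `m2` stays in the index, but the recheck hides it from readers. The price is one extra main-table read per index hit, which is the expense mentioned above.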
-
Re: Secondary indexes suggestions
Lukáš Drbal, 2012-08-14, 11:44

Hi,

thanks a lot for all the responses.

Otis: the filter from your link looks great, I'll check it in my tests.

Michael: I understand what secondary indexes are, but I still have no idea about an effective rowkey format. I'm OK with a delay in creating the secondary index and with non-atomic updates; we don't need "realtime" data.

Say I have 10 messages with ids 1, 8, 10, 255, ... from one user with id 88. I see only 2 options for the rowkey in the secondary index:

1) a composite rowkey like <userId><SEPARATOR><messageId>
2) use userId as the rowkey and put the messageIds into cells

Are there any others?

With the first method, I must scan over many rows. What about startRow for the scanner? Can this scan be efficient?

The second method needs very many cells and I don't need them all at one time, so this is IMHO a bad idea.

--
Save The World - http://www.worldcommunitygrid.org/
http://www.worldcommunitygrid.org/stat/viewMemberInfo.do?userName=LesTR

Lukas Drbal
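
The two rowkey options above can be put side by side in a small Python sketch. Dicts stand in for the index table; the `|` separator, the zero-padded ids and the extra user 89 are assumptions made for the illustration, not anything HBase requires.

```python
# Two toy layouts for a "messages by user" index. Dicts stand in for HBase
# tables; the separator and the zero padding are assumptions of the sketch.

SEPARATOR = "|"


def tall_key(user_id, message_id):
    # Option 1: one index row per message, keyed <userId><SEPARATOR><messageId>.
    # Fixed-width ids keep lexicographic order equal to numeric order.
    return f"{user_id:010d}{SEPARATOR}{message_id:010d}"


def build_tall(pairs):
    return {tall_key(user, msg): b"" for user, msg in pairs}


def build_wide(pairs):
    # Option 2: one index row per user, one cell (column) per message id.
    wide = {}
    for user, msg in pairs:
        wide.setdefault(f"{user:010d}", {})[f"{msg:010d}"] = b""
    return wide


pairs = [(88, 1), (88, 8), (88, 10), (88, 255), (89, 3)]
tall = build_tall(pairs)
wide = build_wide(pairs)

# Option 1 reads a user's messages with a prefix scan, modelled here by
# filtering the sorted row keys.
prefix = f"{88:010d}{SEPARATOR}"
tall_hits = [key for key in sorted(tall) if key.startswith(prefix)]

# Option 2 reads them from a single row.
wide_hits = sorted(wide[f"{88:010d}"])

print(len(tall), "rows in the tall layout;", len(wide), "rows in the wide layout")
print(tall_hits)
print(wide_hits)
```

Both layouts hold the same information. The difference is whether a user's messages are spread over many short rows or packed into one row that keeps growing.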
-
Re: Secondary indexes suggestions
Michael Segel, 2012-08-14, 11:55

Ah... schema design...

Yes, you have identified both options... but just to add a twist: in the column name, prepend (epoch - timestamp) to the message id. This will put the messages in reverse order. The only drawback is that it's theoretically possible to create a row which exceeds your region's size...

You could also do this if you use a composite key: hash the user_id, then append (epoch - timestamp), and then the message_id.

You are correct that you have to scan many rows. However, by using a scanner that has the user_id as the start key and the user_id plus the first character after the separator as the end key, the scan only touches that user's rows.

The only reason I would hash the key is to get a more even distribution of data across the cluster, but that's not really that important.
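
The composite key and the bounded scan from this reply can be sketched as below. It is a Python toy with a sorted list standing in for the index table. Several details are assumptions rather than anything stated in the thread: "(epoch - timestamp)" is modelled as a fixed maximum `MAX_TS` minus the timestamp in milliseconds, the hash is an 8-character MD5 prefix kept in front of the user id, and `|` is the separator.

```python
import bisect
import hashlib

# Toy model of the composite index key: hashed user_id, reversed timestamp,
# message_id. A sorted list stands in for the index table. MAX_TS, the MD5
# prefix length and the separator are assumptions, not HBase requirements.

MAX_TS = 2**63 - 1
SEP = "|"


def user_prefix(user_id):
    # A short hash prefix spreads users across the key space.
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()[:8]
    return f"{digest}{SEP}{user_id}{SEP}"


def index_key(user_id, timestamp_ms, message_id):
    # MAX_TS - timestamp makes newer messages sort first.
    return f"{user_prefix(user_id)}{MAX_TS - timestamp_ms:019d}{SEP}{message_id}"


def user_range(user_id):
    """Start/stop keys covering every index row of one user."""
    base = user_prefix(user_id)[:-1]  # drop the trailing separator
    # The stop key ends in the first character after the separator.
    return base + SEP, base + chr(ord(SEP) + 1)


def time_range(user_id, newest_ms, oldest_ms):
    """Start key (inclusive) and stop key (exclusive) for a time range."""
    prefix = user_prefix(user_id)
    start = f"{prefix}{MAX_TS - newest_ms:019d}"
    stop = f"{prefix}{MAX_TS - oldest_ms:019d}{chr(ord(SEP) + 1)}"
    return start, stop


def scan(sorted_keys, start, stop):
    lo = bisect.bisect_left(sorted_keys, start)
    hi = bisect.bisect_left(sorted_keys, stop)
    return sorted_keys[lo:hi]


keys = sorted([
    index_key(88, 1000, "m1"),
    index_key(88, 2000, "m8"),
    index_key(88, 3000, "m10"),
    index_key(99, 2500, "m7"),
])
start, stop = time_range(88, newest_ms=3000, oldest_ms=2000)
inbox = [key.rsplit(SEP, 1)[1] for key in scan(keys, start, stop)]
print(inbox)
```

Because the start and stop keys share the user's prefix, the scan never visits rows of other users, and within the range the newest message comes back first.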
-
Re: Secondary indexes suggestions
lars hofhansl, 2012-08-15, 00:08

Thanks Andy.

Yep. It's not simple if (and only if) your data is changing a lot. Michael is right, though, that it is a simple problem if your data is static.

In my mind we should think about providing the building blocks (like the limited cross-row transaction stuff I did a while back), rather than forcing a particular implementation.

o some folks cannot tolerate cross-region-server lookups/updates
o others will want index-covered queries, i.e. denormalization
o others will want the equivalent of materialized views
o some have natural shards (or tenants, maybe), and shards should be co-located with their indexes
o etc.

Todd Lipcon and I were talking last week, and he mentioned primitives like logged updates: operations that will eventually complete. As long as log replay can be forced before a read operation, they can be used for consistent indexes.

-- Lars

----- Original Message -----
From: Andrew Purtell <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
Cc:
Sent: Monday, August 13, 2012 8:11 PM
Subject: Re: Secondary indexes suggestions

Please pardon while I ramble, this started off as a short response and is now... lengthy.

I've also seen Megastore-inspired secondary index implementations that clone the data from the primary table into the secondary table, by sort order of the attribute that is indexed. In Megastore this was configurable on a per index table basis:

"Accessing entity data through indexes is normally a two-step process: first the index is read to find matching primary keys, then these keys are used to fetch entities. We provide a way to denormalize portions of entity data directly into index entries. By adding the STORING clause to an index [...]"

A naive implementation of this for HBase will require consistency checking of the index table(s), because it is easy for the denormalized data to become stale in some places if a client (or coprocessor) fails mid-write or if the index update is significantly delayed from the primary table update. A non-naive implementation will have some Paxos-ish commit protocol that is difficult to implement correctly and doubly difficult to make perform well. Without that extra layer, it is assured the index is always slightly out of date. The lag can increase substantially if index region(s) are in transition when the primary table write happens, and then the client (or coprocessor) has to wait to update the index. You could also do this in reverse and update the index table first. Either the client would have to do this as Lars says, or a background MapReduce-based process might be employed, or both.

Without denormalization, you have the possibility of dangling pointers in the index tables, or data in the primary table that is not fully indexed. These cases would also have to be found and fixed; the secondary index could potentially always be in some slight state of disrepair.

CCIndex (https://github.com/Jia-Liu/CCIndex) was a scheme such as the above that also reduced the replication factor of denormalized index tables to soften the storage impact of the data duplication, and patched HBase core to regenerate the index table from the primary table if one of the index HFiles became corrupt. This is a questionable idea in my opinion, but it does lead to the interesting consideration of whether HBase should support trapping HFile IOEs to enable this sort of thing to be built as a coprocessor.

A secondary indexing coprocessor could force the colocation of regions of a primary table with the regions of index tables that map back to them. Cross-region transactions are possible within a single RegionServer. A MasterObserver could control region placement. A RegionObserver on the region of the primary table could transact with those on the regions of the index tables. A WALObserver could group the update to the primary table and indexes into a single WAL entry. Should the RegionServer crash mid-transaction, all updates would be replayed from the WAL, maintaining at all times the consistency of the index(es) with respect to the primary table.

But I see a number of challenges with this. Foremost, your availability concerns are no longer limited to the regions of the primary table possibly being in transition: updates to the primary table would need to block until all relevant index regions are migrated over to where the primary region is resident. It may be worth trying to do something like this, but evicting regions to make room for colocation of index table regions with primary table regions could get out of hand. After a couple of RegionServers fail, perhaps quickly after each other, would the cluster converge to full availability? This would have to be extensively tested.

The above is a fair amount of (over)engineering for the case where the client should be oblivious to how secondary indexing is done on the cluster. If that is not a design constraint, then HBase 0.94+ has limited cross-row atomicity within a single region. So if you are able to construct primary record keys and index record keys such that they all fall within the keyspace of a single region, then this can be done today: the client can send them up as a group packed into a single RPC and be assured of server-side atomic commit. However, doing such keyspace engineering while also aiming for efficient queries could be a big challenge.

On Mon, Aug 13, 2012 at 5:42 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
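
The keyspace engineering mentioned at the end of the quoted message can be illustrated with a toy model: primary and index row keys share a tenant prefix so that they land in the same region, and a batch is accepted only if all of its keys do. This is a Python sketch, not the HBase 0.94 API. The split keys, the `/data/` and `/idx/` key layout and every function name are hypothetical, and the sketch assumes no region boundary cuts through a tenant's prefix.

```python
import bisect

# Toy model: a table split into regions by sorted split keys, and a batch
# that is applied only if every row key falls in the same region. All names
# and the key layout are hypothetical.

SPLIT_KEYS = ["g", "n", "t"]   # four regions: [..g), [g..n), [n..t), [t..)
table = {}


def region_of(row_key):
    return bisect.bisect_right(SPLIT_KEYS, row_key)


def primary_key(tenant, message_id):
    return f"{tenant}/data/{message_id}"


def index_key(tenant, recipient, message_id):
    return f"{tenant}/idx/{recipient}/{message_id}"


def atomic_batch(puts):
    """Apply all puts or none; refuse batches that span regions."""
    regions = {region_of(key) for key, _ in puts}
    if len(regions) != 1:
        raise ValueError("batch spans more than one region")
    table.update(dict(puts))


atomic_batch([
    (primary_key("acme", "m1"), "hello"),
    (index_key("acme", "bob", "m1"), ""),
])

try:
    atomic_batch([
        (primary_key("acme", "m2"), "hi"),
        (index_key("zeta", "bob", "m2"), ""),
    ])
except ValueError as err:
    print("rejected:", err)

print(sorted(table))
```

The trade-off named in the message shows up directly: an index keyed under the tenant prefix can only answer queries scoped to that tenant, so the atomicity is bought with less freedom in query design.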
-
Re: Secondary indexes suggestions
Andrew Purtell, 2012-08-15, 01:59

Hey Lars,

On Tue, Aug 14, 2012 at 5:08 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
> Yep. It's not simple if (and only if) you data is changing a lot. Michael is right though, that it is simple problem if your data is static.

Yeah, a good option for that is MR processes that emit, in one shot, HFiles for bulk import of infrequent updates into the primary table and all projections/materializations/indices. We have an application that does this in production.

> Todd Lipcon and I were talking last week. And he mentioned primitives like logged updates,operations that will eventually complete, and as long as log-replay can be forced before a read operation they can be used for consistent indexes.

Back to HBASE-3340 again. :-)

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
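
The one-shot batch build for mostly static data can be sketched as a map step plus a sort step. This is a Python toy, not MapReduce or HFile code: the table names `messages`, `inbox_idx` and `outbox_idx`, the key layouts and the sample messages are all hypothetical, and the sorted lists only stand in for the sorted files a bulk import would load.

```python
# Toy model of a one-shot batch build: a map step emits (table, key, value)
# records for the primary table and both indexes, and a sort step groups
# them per table in key order. Names and key layouts are hypothetical.

messages = [
    {"id": "m2", "sender": 7, "recipient": 88, "text": "second"},
    {"id": "m1", "sender": 88, "recipient": 7, "text": "first"},
]


def map_message(msg):
    yield ("messages", msg["id"], msg["text"])
    yield ("inbox_idx", f"{msg['recipient']:010d}|{msg['id']}", "")
    yield ("outbox_idx", f"{msg['sender']:010d}|{msg['id']}", "")


def build_sorted_files(rows):
    files = {}
    for row in rows:
        for table, key, value in map_message(row):
            files.setdefault(table, []).append((key, value))
    for records in files.values():
        records.sort()
    return files


files = build_sorted_files(messages)
for table in sorted(files):
    print(table, [key for key, _ in files[table]])
```

Since the primary table and its indexes are produced from the same input in one pass, there is no window in which one is visible without the other, provided the resulting files are loaded together.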
-
Re: Secondary indexes suggestions
Michael Segel, 2012-08-15, 02:38

I think you need to think outside of the box...

I've thought about it a little more and while there's validity to indexing at the RS, there's a bit more of a headache.

But I think you've been too dismissive of looking at the index at the table level and not at the region level.
-
Re: Secondary indexes suggestions
Andrew Purtell, 2012-08-15, 02:49

On Tue, Aug 14, 2012 at 7:38 PM, Michael Segel <[EMAIL PROTECTED]> wrote:
> I think you need to think outside of the box...
> But I think you've been too dismissive of looking at the index at the table level and not at the region level.

I'd be interested if you can point out exactly where I dismissed something, as in "this is not a good idea..." or "this is wrong..." or any other explicit statement. Otherwise, you are reading in something as implicit that isn't there. I contributed a few thoughts on the subject as opposed to writing a treatise. Why does this have to be an argument instead of a discussion?

But if you don't mind I'm not going to look at this thread further.

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
-
Re: Secondary indexes suggestions
Michael Segel, 2012-08-15, 03:01

Perhaps not dismissive but more focused on indexing at the region.

And it wasn't just you, but also Lars.

Also don't read what I am saying as an argument. It's not. ;-P

I think the issue is how to approach the problem.
-
Re: Secondary indexes suggestions
lars hofhansl, 2012-08-15, 03:57

Maybe we know one thing or the other about this :) There are pros and cons to both approaches. Nothing was dismissed.

For global, table-level indexes we need some sort of distributed commit protocol. For "index transactions" it is slightly simpler, because they are known ahead of time to be idempotent; maybe we can come up with something less strict than 2PC/Paxos.

Naively updating another table and saying "now we have secondary indexes" *is* going to bring unexpected surprises, as it will work until it breaks because of concurrency issues. If you want that, write to two tables from your client.

I think this is a good discussion.

-- Lars
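
The kind of surprise meant by "it will work until it breaks" can be reproduced with a toy model of the naive two-table write. The Python sketch below interleaves the steps of two clients by hand instead of using threads, so the outcome is deterministic. Dicts stand in for tables, and the names and the state values 'Texas' and 'Utah' are invented for the illustration.

```python
# Toy model of the naive "just write to two tables" index update, with the
# steps of two clients interleaved by hand to show a stale index entry.
# Dicts stand in for tables; all names are hypothetical.

main_table = {"12345": "Ohio"}          # row key -> indexed value (State)
index_table = {"Ohio": {"12345"}}       # State -> set of row keys


def read_old(row_key):
    return main_table[row_key]


def write_main(row_key, new_state):
    main_table[row_key] = new_state


def write_index(row_key, old_state, new_state):
    index_table.get(old_state, set()).discard(row_key)
    index_table.setdefault(new_state, set()).add(row_key)


# Both clients read the old value before either one writes.
old_a = read_old("12345")
old_b = read_old("12345")

write_main("12345", "Texas")            # client A
write_main("12345", "Utah")             # client B wins on the main row
write_index("12345", old_a, "Texas")    # client A
write_index("12345", old_b, "Utah")     # client B removes only "Ohio"

stale = sorted(state for state, keys in index_table.items()
               if "12345" in keys and main_table["12345"] != state)
print("main row:", main_table["12345"], "stale index entries:", stale)
```

Each client on its own leaves the index correct. Only the interleaving leaves a 'Texas' entry pointing at a row that now says 'Utah', which is why some form of commit protocol or read-time recheck is needed.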