HBase, mail # user - Programming practices for implementing composite row keys

Re: Programming practices for implementing composite row keys
Shahab Yunus 2013-09-05, 14:14
My 2 cents:

1- Yes, that is one way to do it. You can also use a fixed length for every
attribute participating in the composite key. I believe HBase scans fit the
fixed-length pattern better as well, since each attribute then always sits
at the same offset in the key. It is basically a trade-off between space
(all that padding increases the key size) and the complexity of choosing and
handling a delimiter and then parsing the keys.

2- I personally have not heard of this. As far as I understand, it goes
against the whole idea of HBase scanning, and prefix and fuzzy filters will
not be possible this way. I would not follow this approach.

3- See replies to 1 & 2

4- By default, keys are sorted with a binary comparator, i.e. byte by byte
with each byte treated as unsigned. Negative numbers are a bit tricky, as
far as I know and the last I checked. Some tips here:

Can you normalize them (or take the absolute value) before reading and
writing (at some cost in performance, of course)? That only works if keys
with the same magnitude but a different sign cannot exist as different
entities. This depends on your business logic and the type/nature of your data.
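
To illustrate point 4, here is a minimal sketch (in Python rather than the
HBase Java client) of why negative numbers are tricky under a byte-wise
comparator. Python's bytes comparison stands in for HBase row key ordering,
since both compare byte by byte with unsigned bytes. The sign-bit flip in
encode_sortable is one commonly used workaround and is not something proposed
in this thread; all function names are made up for the example.

```python
import struct

SIGN_BIT = 1 << 63  # top bit of a 64-bit value


def encode_raw(n: int) -> bytes:
    """Plain big-endian two's-complement encoding of a signed 64-bit integer."""
    return struct.pack(">q", n)


def encode_sortable(n: int) -> bytes:
    """Shift the value by 2**63 (equivalent to flipping the sign bit) so that
    unsigned byte-wise order matches numeric order."""
    return struct.pack(">Q", n + SIGN_BIT)


def decode_sortable(key: bytes) -> int:
    """Inverse of encode_sortable."""
    return struct.unpack(">Q", key)[0] - SIGN_BIT


values = [5, -3, 0, -10, 7]

# Raw encoding: negatives start with 0xFF bytes, so they sort after positives.
raw_order = sorted(values, key=encode_raw)

# Sign-flipped encoding: byte order and numeric order agree.
sortable_order = sorted(values, key=encode_sortable)

print(raw_order)
print(sortable_order)
```

With the raw encoding the negative values end up after the positive ones,
which is the surprise being discussed; with the shifted encoding the byte
order follows the numeric order, at the cost of decoding on every read.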

On Thu, Sep 5, 2013 at 10:03 AM, praveenesh kumar <[EMAIL PROTECTED]> wrote:

> Hello people,
> I have a scenario which requires creating composite row keys for my HBase
> table.
> Basically it would be <entity1,entity2,entity3>.
> Searches would be by entity1 first and then by entity2 and entity3. I know
> I can do a row <start-stop> scan on entity1 first and then put row filters
> on entity2 and entity3.
> My question is what are the best programming principles to implement these
> keys.
> 1. Just use simple delimiters <entity1:entity2:entity3>.
> 2. Create complex datatypes like Java structures. I don't know if anyone
> uses structures as keys; if they do, can someone please point out the
> scenarios where they are a good fit? Are they a good fit for this
> scenario?
> 3. What are the pros and cons of both 1 and 2 when it comes to data
> retrieval?
> 4. My <entity1> can also be negative. Does that make any special difference
> as far as HBase ordering is concerned? How can I tackle this scenario?
> Any help on how to implement composite row keys would be highly helpful. I
> want to understand how the community deals with implementing composite row
> keys.
> Regards
> Praveenesh
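
For the <entity1,entity2,entity3> layout in the question, the fixed-length
versus delimiter trade-off from point 1 can be sketched as follows. This is
plain Python, not the HBase client API: the 8-byte widths, the NUL padding,
the ":" separator and every function name are assumptions made up for the
example. prefix_range computes the start row (inclusive) and stop row
(exclusive) that a <start-stop> scan on entity1 would use; an empty stop row
is taken to mean "scan to the end of the table".

```python
from typing import Tuple

WIDTHS = (8, 8, 8)  # hypothetical fixed width, in bytes, for each entity


def pad(value: str, width: int) -> bytes:
    """Encode a text value and right-pad it with NUL bytes to a fixed width."""
    raw = value.encode("utf-8")
    if len(raw) > width:
        raise ValueError(f"{value!r} does not fit in {width} bytes")
    return raw.ljust(width, b"\x00")


def fixed_key(e1: str, e2: str, e3: str) -> bytes:
    """Fixed-length composite key: no delimiter, but padding costs space."""
    return b"".join(pad(v, w) for v, w in zip((e1, e2, e3), WIDTHS))


def split_fixed_key(key: bytes) -> Tuple[str, ...]:
    """Parsing is slicing at known offsets (values must not end in NUL)."""
    parts, offset = [], 0
    for width in WIDTHS:
        parts.append(key[offset:offset + width].rstrip(b"\x00").decode("utf-8"))
        offset += width
    return tuple(parts)


def delimited_key(e1: str, e2: str, e3: str, sep: str = ":") -> bytes:
    """Delimited composite key: compact, but values must never contain sep."""
    if any(sep in v for v in (e1, e2, e3)):
        raise ValueError("separator occurs inside a value")
    return sep.join((e1, e2, e3)).encode("utf-8")


def prefix_range(prefix: bytes) -> Tuple[bytes, bytes]:
    """Start row (inclusive) and stop row (exclusive) covering the prefix."""
    stripped = prefix.rstrip(b"\xff")  # trailing 0xFF bytes cannot be incremented
    if not stripped:
        return prefix, b""
    return prefix, stripped[:-1] + bytes([stripped[-1] + 1])


keys = sorted([
    fixed_key("acme", "eu", "42"),
    fixed_key("acme", "us", "7"),
    fixed_key("acmeco", "eu", "1"),
    fixed_key("beta", "eu", "3"),
])

# All rows whose entity1 is exactly "acme" ("acmeco" is excluded by the padding).
start, stop = prefix_range(pad("acme", WIDTHS[0]))
matches = [split_fixed_key(k) for k in keys if start <= k < stop]
print(matches)
```

With delimited keys the same range trick works on the prefix "acme:" instead
of the padded value, which saves the padding bytes but means the delimiter
has to be kept out of (or escaped in) every value.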