HBase >> mail # user >> How to scan rows starting with a particular string?


Hari Sreekumar 2011-04-26, 09:45
Suraj Varma 2011-04-26, 14:05
Hari Sreekumar 2011-04-27, 05:43
Himanshu Vashishtha 2011-04-27, 06:54
Joe Pallas 2011-04-27, 17:00
Re: How to scan rows starting with a particular string?
On Wed, Apr 27, 2011 at 11:00 AM, Joe Pallas <[EMAIL PROTECTED]> wrote:

>
> On Apr 26, 2011, at 11:54 PM, Himanshu Vashishtha wrote:
>
> > HBase uses utf-8 encoding to store the row keys, so it can store non-ascii
> > characters too (yes they will be larger than 1 byte).
>
> That statement may be misleading.  HBase doesn't use any encoding at all,
> because row keys are simply arrays of bytes.  HBase cares only about the
> sorting order of those byte arrays, and neither knows nor cares what
> interpretation the client may attach to them.
What I meant was that for a String like "façade" or "fad", it uses the UTF-8
encoding scheme to create those byte arrays (and therefore you can store
non-ASCII values too; each character takes 1-4 bytes, but as an end user
you don't need to care about that).
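As a small illustration of the point about byte arrays, here is a minimal sketch using only the Java standard library. The class and method names (RowKeyBytes, toRowKey) are made up for this example and are not part of HBase; the assumption is simply that the client encodes String keys as UTF-8 before handing them over.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not an HBase class: shows what a String key looks like as bytes.
class RowKeyBytes {
    // Encode a String row key as UTF-8; the store only ever sees the resulting byte array.
    static byte[] toRowKey(String key) {
        return key.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "fa\u00e7ade" is "façade"; the escape keeps this source file pure ASCII.
        System.out.println(toRowKey("fad").length);          // 3 bytes, all ASCII
        System.out.println(toRowKey("fa\u00e7ade").length);  // 7 bytes, ç takes 2
    }
}
```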
>
> The UTF-8 standard mentions that the byte-value lexicographic sorting order
> of UTF-8 strings matches the sorting order of the Unicode character numbers,
> so a client can turn 16- or 32-bit Unicode strings into UTF-8 in order to
> use them as keys and they will sort the same way.  (Although the standard
> warns that "a sort order based on character numbers is almost never
> culturally valid.")
>
> On the plus side, that means you never have to worry about "What's the next
> character after ç?"  Just add 1.  But don't be surprised when "fad" comes
> before "façade" in your sort.
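To make the ordering concrete, here is a rough sketch of an unsigned byte-by-byte comparison, which is the kind of ordering described above. The class ByteOrderDemo and its methods are hypothetical names for this example only, written against the plain Java standard library rather than any HBase API.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: compares keys the way a byte-ordered store would.
class ByteOrderDemo {
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Unsigned lexicographic comparison: negative if a sorts before b.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;  // mask so bytes >= 0x80 compare as 128..255, not negative
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;  // on a shared prefix, the shorter key sorts first
    }

    public static void main(String[] args) {
        // 'd' is 0x64, while ç starts with 0xC3, so "fad" sorts before "façade".
        System.out.println(compareBytes(utf8("fad"), utf8("fa\u00e7ade")) < 0);  // true
        // Plain "facade" still sorts before "fad", since 'c' (0x63) < 'd' (0x64).
        System.out.println(compareBytes(utf8("facade"), utf8("fad")) < 0);       // true
    }
}
```

So in byte order the three keys come out as "facade", "fad", "façade", which is not where a dictionary would put them.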
>
Yes, no need for any hard-coding. Just add 1 to the last byte of the byte
array formed from the key prefix that you want to search for.
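A minimal sketch of the "add 1 to the last byte" idea, using only the Java standard library. PrefixStopRow and stopRowForPrefix are hypothetical names, not HBase classes. One thing this sketch adds that was not discussed above: if the last byte is already 0xFF, adding 1 would wrap around, so it drops trailing 0xFF bytes first. The idea is that the prefix would be the start row of the scan and the returned array the exclusive stop row; treating an empty result as "no upper bound" is an assumption you should check against your HBase version.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper: computes an exclusive upper bound for all keys with a given prefix.
class PrefixStopRow {
    static byte[] stopRowForPrefix(byte[] prefix) {
        int end = prefix.length;
        // Assumption beyond the thread: skip trailing 0xFF bytes, which cannot be incremented.
        while (end > 0 && prefix[end - 1] == (byte) 0xff) {
            end--;
        }
        if (end == 0) {
            return new byte[0];  // empty or all-0xFF prefix: no finite upper bound
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++;  // "just add 1" to the last remaining byte
        return stop;
    }

    public static void main(String[] args) {
        byte[] start = "fad".getBytes(StandardCharsets.UTF_8);
        byte[] stop = stopRowForPrefix(start);
        // Rows in [start, stop) are exactly the rows whose key begins with "fad".
        System.out.println(new String(stop, StandardCharsets.UTF_8));  // fae
    }
}
```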

Hope this is not that confusing now. :)

> joe
>
>