HBase >> mail # user >> Scanning half a key or value in HBase


Re: Scanning half a key or value in HBase
If my table is huge, do I get a full scan?
If you want good performance on random reads,
you really need start and stop keys.
PrefixFilters are usable in compound filters. If you want
only one range (like 123_*), you must use start/stop keys.
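
A minimal sketch of how the stop key for such a range could be derived from the prefix alone, so that no lookup for "where 123 starts" is needed. The class and method names (PrefixRange, stopKeyForPrefix) are made up for illustration and are not part of the HBase API. The sketch only computes the two byte arrays. Passing them to a scan, for example via the Scan(startRow, stopRow) constructor, is left out and should be checked against the client version in use.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not an HBase class: derives the half-open range
// [prefix, stopKey) that covers every row key starting with the prefix.
final class PrefixRange {

    /**
     * Returns the smallest key that sorts after every key starting with
     * the given prefix, comparing bytes as unsigned values.
     * Returns an empty array when no such key exists (empty prefix or a
     * prefix made only of 0xFF bytes). Assumption: the caller treats an
     * empty stop key as "scan to the end of the table".
     */
    static byte[] stopKeyForPrefix(byte[] prefix) {
        // Trailing 0xFF bytes cannot be incremented, so drop them first.
        int end = prefix.length;
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        // Copy the remaining bytes and add one to the last of them.
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++;
        return stop;
    }

    public static void main(String[] args) {
        byte[] start = "123_".getBytes(StandardCharsets.UTF_8);
        byte[] stop = stopKeyForPrefix(start);
        // '_' is 0x5F, so the stop key ends in 0x60, which is '`'.
        System.out.println(new String(start, StandardCharsets.UTF_8)
                + " .. " + new String(stop, StandardCharsets.UTF_8));
    }
}
```

Including the separator in the prefix ("123_" rather than "123") keeps keys such as 1234_foo out of the range.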

2010/8/23 Samuru Jackson <[EMAIL PROTECTED]>:
> Hi,
>
> I do it this way:
>
> The variable searchValue is my prefix; in your case it would be 123:
>
> searchValue = "123";
>
> PrefixFilter prefixFilter = new PrefixFilter(Bytes.toBytes(searchValue));
> Scan scan = new Scan();
> scan.addFamily(Bytes.toBytes(this.REF_FAM));
> scan.setFilter(prefixFilter);
> ResultScanner resultScanner = hBaseTable.getScanner(scan);
>
> Now you can iterate over the resultScanner.
>
> Is this what you were looking for?
>
> /SJ
>
>
>
>
> On Mon, Aug 23, 2010 at 6:00 AM, Michelan Arendse <[EMAIL PROTECTED]>
> wrote:
>> Hi,
>>
>> Thanks for the responses but it's still not what I am really looking for.
>>
>> The row id looks something like: number_string so it would be 123_foo,
>> 123_foo2, 123_foo3.
>> So now I want to find all the foo's that are related to the first half of
>> the key which is "123".
>>
>> Also I can't add start row if I do not know where 123 starts. And I can't
>> search for the start row, as I need this to be very fast.
>>
>> Thanks.
>>
>>
>> -----Original Message-----
>> From: Ryan Rawson [mailto:[EMAIL PROTECTED]]
>> Sent: 17 August 2010 09:01 PM
>> To: [EMAIL PROTECTED]
>> Subject: Re: Scanning half a key or value in HBase
>>
>> Hey,
>>
>> One thing to watch out for is variable-length ASCII keys with a
>> separator. You would think that if your key structure was:
>>
>> foo:bar
>>
>> starting at 'foo' and ending at 'foo:' might give you only keys which
>> start with 'foo:' but this doesn't work like that.  You also get keys
>> like:
>> foo123:bar
>>
>> you must start the scan at 'foo:' but you can't just end it at 'foo;'
>> (since next(:) == ';' in ascii), this has to do with the ordering of
>> ASCII, for a reference look at http://www.asciitable.com/
>>
>> The bug-free solution is to start your scan at 'foo:' and use a prefix
>> filter set to 'foo:'.
>>
>> If you are scanning fixed-width keys, eg: binary conversions of longs,
>> then the [start,start+1) solution works.
>>
>> On Tue, Aug 17, 2010 at 5:59 AM, Andrey Stepachev <[EMAIL PROTECTED]>
>> wrote:
>>> Use scan where start key is <first_half_of_key> itself as bytearray, and
>>> stop key is <first_half_of_key> with last byte in bytearray + 1.
>>>
>>> example
>>> abc% should be scan(abc, abd)
>>>
>>> 2010/8/17 Michelan Arendse <[EMAIL PROTECTED]>:
>>>> Hi
>>>>
>>>> I am not sure if this is possible in HBase. What I am trying to do is
>>>> scan on an HBase table with something similar to how SQL would do it.
>>>> e.g. SELECT *
>>>>         FROM <table>
>>>>         WHERE <primary key> LIKE '<first_half_of_key>%' ;
>>>>
>>>> So as you can see from above I want to scan the table with only part of
>>>> the row key, since the key is a combination of 2 fields in the table.
>>>>
>>>> Regards,
>>>> Michelan Arendse
>>>>
>>>>
>>>>
>>>
>>
>
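
The separator pitfall described in the quoted thread can be reproduced without a cluster. The sketch below uses a sorted in-memory set as a stand-in for the table, on the assumption that its ordering matches HBase row ordering for plain ASCII keys. The class SeparatorRangeDemo, its methods and the sample keys are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// In-memory stand-in for a table whose row keys are kept in sorted order.
final class SeparatorRangeDemo {

    // Half-open range scan: start inclusive, stop exclusive.
    static List<String> scan(TreeSet<String> rows, String start, String stop) {
        return new ArrayList<>(rows.subSet(start, stop));
    }

    // Start at the prefix and keep rows only while they still carry it.
    static List<String> scanWithPrefix(TreeSet<String> rows, String prefix) {
        List<String> out = new ArrayList<>();
        for (String key : rows.tailSet(prefix)) {
            if (!key.startsWith(prefix)) {
                break; // sorted order: no later key can match the prefix
            }
            out.add(key);
        }
        return out;
    }

    public static void main(String[] args) {
        TreeSet<String> rows = new TreeSet<>(
                Arrays.asList("foo:bar", "foo:baz", "foo123:bar", "fop:bar"));

        // '1' (0x31) sorts before ':' (0x3A), so this range returns
        // foo123:bar and misses every foo:* key.
        System.out.println(scan(rows, "foo", "foo:"));

        // Starting at "foo:" and checking the prefix returns the foo:* keys.
        System.out.println(scanWithPrefix(rows, "foo:"));
    }
}
```

The first line printed is [foo123:bar] and the second is [foo:bar, foo:baz].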