HBase >> mail # user >> Scanning half a key or value in HBase


Re: Scanning half a key or value in HBase
This is another case, because this is not a prefix scan. This is
an inclusive scan.
In this case, of course, you should use some technique such as indexing
or a full scan with a filter.
But this is not a real-time solution for any noticeable number of
found keys
(you can achieve ~4 ms per record, so for 1000 rows, for example, you get 4 sec).

2010/8/24 Michelan Arendse <[EMAIL PROTECTED]>:
> This works wonderfully. Have a look at the code, because it's still a bit slow and I need it to be lightning fast.

> IndexedTable table = new IndexedTable(_hbManager.getConfiguration(), Bytes.toBytes("Table"));
> ResultScanner scanner = table.getIndexedScanner("IndexId", null,  null, null, filter,
>                new byte[][] {Bytes.toBytes("Colum_Family:column1")});

Under the hood, IndexedTable performs a Get for each row found in the index,
so you can't
achieve very fast index scans. Only denormalization can help.

>
> -----Original Message-----
> From: Andrey Stepachev [mailto:[EMAIL PROTECTED]]
> Sent: 23 August 2010 09:11 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Scanning half a key or value in HBase
>
> If my table is huge, do I get a full scan?
> If you want to get good performance on random reads,
> you really need start and stop keys.
> PrefixFilters are usable in compound filters. If you want
> only one range (like 123_*), you must use start/stop keys.
>
> 2010/8/23 Samuru Jackson <[EMAIL PROTECTED]>:
>> Hi,
>>
>> I do it this way:
>>
>> The variable searchValue is my prefix; in your case, for 123, it would be:
>>
>> searchValue = "123";
>>
>> PrefixFilter prefixFilter = new PrefixFilter(Bytes.toBytes(searchValue));
>> Scan scan = new Scan();
>> scan.addFamily(Bytes.toBytes(this.REF_FAM));
>> scan.setFilter(prefixFilter);
>> ResultScanner resultScanner = hBaseTable.getScanner(scan);
>>
>> Now you can iterate over the resultScanner.
>>
>> Is this what you were looking for?
>>
>> /SJ
>>
>>
>>
>>
>> On Mon, Aug 23, 2010 at 6:00 AM, Michelan Arendse <[EMAIL PROTECTED]>
>> wrote:
>>> Hi,
>>>
>>> Thanks for the responses but it's still not what I am really looking for.
>>>
>>> The row id looks something like number_string, so it would be 123_foo,
>>> 123_foo2, 123_foo3.
>>> So now I want to find all the foos that are related to the first half of
>>> the key, which is "123".
>>>
>>> Also, I can't add a start row if I do not know where 123 starts. And I can't
>>> search for the start row, as I need this to be very fast.
>>>
>>> Thanks.
>>>
>>>
>>> -----Original Message-----
>>> From: Ryan Rawson [mailto:[EMAIL PROTECTED]]
>>> Sent: 17 August 2010 09:01 PM
>>> To: [EMAIL PROTECTED]
>>> Subject: Re: Scanning half a key or value in HBase
>>>
>>> Hey,
>>>
>>> One thing to watch out for is variable-length ASCII keys with a
>>> separator. You would think that if your key structure was:
>>>
>>> foo:bar
>>>
>>> then starting at 'foo' and ending at 'foo:' might give you only keys which
>>> start with 'foo:', but it doesn't work like that. You also get keys
>>> like:
>>> foo123:bar
>>>
>>> You must start the scan at 'foo:', but you can't just end it at 'foo;'
>>> (since next(:) == ';' in ASCII). This has to do with the ordering of
>>> ASCII; for a reference, look at http://www.asciitable.com/
>>>
>>> The bug-free solution is to start your scan at 'foo:' and use a prefix
>>> filter set to 'foo:'.
>>>
>>> If you are scanning fixed-width keys, e.g. binary conversions of longs,
>>> then the [start, start+1) solution works.
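
The ordering pitfall described above can be reproduced without a cluster. The following is only a sketch: a sorted `TreeSet` of ASCII strings stands in for HBase's sorted row keys, and the class name `SeparatorKeyScan`, its methods, and the sample rows are all hypothetical, not part of the HBase API. It shows that a [start, stop) range from 'foo' to 'foo:' picks up 'foo123:bar', while starting at 'foo:' and stopping at the first row that lacks the prefix returns only the 'foo:' rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

class SeparatorKeyScan {
    // Hypothetical sample rows; the TreeSet keeps them in ASCII order,
    // which is an assumed stand-in for HBase's byte-ordered row keys.
    static NavigableSet<String> table() {
        return new TreeSet<>(Arrays.asList(
                "foo", "foo123:bar", "foo:bar", "foo:baz", "fop:bar"));
    }

    // Emulates a scan with an inclusive start row and an exclusive stop row.
    static List<String> rangeScan(NavigableSet<String> rows, String start, String stop) {
        return new ArrayList<>(rows.subSet(start, true, stop, false));
    }

    // Emulates "start the scan at the prefix and filter on the prefix":
    // seek to the prefix, then stop at the first row that does not carry it.
    static List<String> prefixScan(NavigableSet<String> rows, String prefix) {
        List<String> out = new ArrayList<>();
        for (String row : rows.tailSet(prefix, true)) {
            if (!row.startsWith(prefix)) {
                break;
            }
            out.add(row);
        }
        return out;
    }

    public static void main(String[] args) {
        NavigableSet<String> rows = table();
        System.out.println(rangeScan(rows, "foo", "foo:")); // prints [foo, foo123:bar]
        System.out.println(prefixScan(rows, "foo:"));       // prints [foo:bar, foo:baz]
    }
}
```

'1' (0x31) sorts before ':' (0x3A), which is why 'foo123:bar' lands inside the first range.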
>>>
>>> On Tue, Aug 17, 2010 at 5:59 AM, Andrey Stepachev <[EMAIL PROTECTED]>
>>> wrote:
>>>> Use a scan where the start key is <first_half_of_key> itself as a byte
>>>> array, and the stop key is <first_half_of_key> with the last byte of the
>>>> byte array incremented by 1.
>>>>
>>>> Example:
>>>> abc% should be scan(abc, abd)
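
The "last byte + 1" rule above can be sketched in plain Java. This is a minimal illustration, not code from the thread: the class `PrefixRange` and the method `stopKeyForPrefix` are hypothetical names. The handling of trailing 0xFF bytes (which cannot be incremented) and the use of an empty array to mean "no upper bound" are assumptions of this sketch.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class PrefixRange {
    // Returns the exclusive stop key for a scan over all keys that start with
    // the given prefix. An empty result means "no upper bound" (an assumption
    // of this sketch, used when the prefix is empty or is all 0xFF bytes).
    static byte[] stopKeyForPrefix(byte[] prefix) {
        // Trailing 0xFF bytes cannot be incremented, so drop them first.
        int end = prefix.length;
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++; // increment the last remaining byte
        return stop;
    }

    public static void main(String[] args) {
        byte[] start = "abc".getBytes(StandardCharsets.US_ASCII);
        byte[] stop = stopKeyForPrefix(start);
        // The scan would then cover [start, stop).
        System.out.println(new String(stop, StandardCharsets.US_ASCII)); // prints abd
    }
}
```

The resulting pair would be passed as the start and stop rows of the scan, with the prefix itself as the start row.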
>>>>
>>>> 2010/8/17 Michelan Arendse <[EMAIL PROTECTED]>:
>>>>> Hi
>>>>>
>>>>> I am not sure if this is possible in HBase. What I am trying to do is
>>>>> scan an HBase table with something similar to how SQL would do it.
>>>>> e.g. SELECT *
>>>>>         FROM <table>
>