Re: Read access pattern

For the next key, I think you can simply use your current key as the
scanner's start row. The start row is inclusive, so the scan returns
your current row first and the one just after it second. Then you will
have to verify the MD5 hash prefix to make sure it is still for the
same object.

scan.setStartRow(Bytes.toBytes(getMD5AsHex(Bytes.toBytes(myObjectId)) +
String.format("%19d\n", (Long.MAX_VALUE - changeDate.getTime()))));
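To make the "next key" lookup concrete, here is a minimal sketch in plain Java. It does not touch HBase: a TreeSet stands in for the sorted table, and tailSet(key, true) plays the role of a scan with an inclusive start row. The class and method names are made up for the illustration, and the keys are the sample keys from the question.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.TreeSet;

// Sketch only: a TreeSet simulates the sorted row keys of the table.
class NextRowSketch {
    // An MD5 digest written as hex is 32 characters long.
    static final int HASH_LENGTH = 32;

    // Returns the row right after currentKey, or null if the following
    // row belongs to another object (different hash prefix) or is missing.
    static String nextRow(TreeSet<String> rows, String currentKey) {
        String hash = currentKey.substring(0, HASH_LENGTH);
        // Inclusive start, like a scan whose start row is the current key.
        Iterator<String> scan = rows.tailSet(currentKey, true).iterator();
        while (scan.hasNext()) {
            String row = scan.next();
            if (row.equals(currentKey)) {
                continue; // the start row itself comes back first
            }
            return row.startsWith(hash) ? row : null;
        }
        return null;
    }

    public static void main(String[] args) {
        TreeSet<String> rows = new TreeSet<>(Arrays.asList(
            "00003db1b6c1e7e7d2ece41ff2184f76*9223370673172227807",
            "00003db1b6c1e7e7d2ece41ff2184f76*9223370674468022807",
            "00003db1b6c1e7e7d2ece41ff2184f76*9223370674468862807",
            "00003db1b6c1e7e7d2ece41ff2184f76*9223370674984237807",
            "00003db1b6c1e7e7d2ece41ff2184f76*9223370674987271807"));
        // Prints 00003db1b6c1e7e7d2ece41ff2184f76*9223370674984237807
        System.out.println(nextRow(rows,
            "00003db1b6c1e7e7d2ece41ff2184f76*9223370674468862807"));
    }
}
```

With a real scanner the same idea applies: read at most two rows, skip the first if it equals the current key, and compare the hash prefix of the second.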

If you want to find the one just before, quickly, I see 2 options.

First, if you know that you are storing data about every 10 seconds,
set the startRow with something like
Bytes.toBytes(getMD5AsHex(Bytes.toBytes(myObjectId)) + String.format("%19d\n",
(Long.MAX_VALUE - (changeDate.getTime() + 60000)))). Because the key
stores Long.MAX_VALUE minus the timestamp, adding 60000 ms gives a start
row that sorts shortly before your current line. Then read the few
lines you get until you find your current line, and keep the last one
before it.
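Here is a rough sketch of that first option, again in plain Java with a TreeSet standing in for the table. The names are hypothetical, the '*' separator is taken from the sample keys in the question, and the window is just a parameter you would tune to your write rate. Note that the start row is built from the timestamp plus the window: the key stores Long.MAX_VALUE minus the timestamp, so a later timestamp gives a smaller key.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Sketch only: a TreeSet simulates the sorted row keys of the table.
class PreviousRowSketch {
    // Reversed timestamp, zero-padded to 19 digits so that string order
    // matches numeric order.
    static String suffix(long timeMillis) {
        return String.format("%019d", Long.MAX_VALUE - timeMillis);
    }

    // Scans a small window that ends at the current row and returns the
    // last row seen before it, or null if the window holds no such row.
    static String previousRow(TreeSet<String> rows, String hash,
                              long timeMillis, long windowMillis) {
        String current = hash + "*" + suffix(timeMillis);
        // A later timestamp yields a smaller suffix, so this sorts first.
        String start = hash + "*" + suffix(timeMillis + windowMillis);
        String last = null;
        for (String row : rows.tailSet(start, true)) {
            if (row.compareTo(current) >= 0) {
                break; // reached the current row
            }
            last = row;
        }
        return last;
    }

    public static void main(String[] args) {
        TreeSet<String> rows = new TreeSet<>(Arrays.asList(
            "00003db1b6c1e7e7d2ece41ff2184f76*9223370673172227807",
            "00003db1b6c1e7e7d2ece41ff2184f76*9223370674468022807",
            "00003db1b6c1e7e7d2ece41ff2184f76*9223370674468862807",
            "00003db1b6c1e7e7d2ece41ff2184f76*9223370674984237807",
            "00003db1b6c1e7e7d2ece41ff2184f76*9223370674987271807"));
        // 1362385913000 is Long.MAX_VALUE - 9223370674468862807.
        // Prints 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468022807
        System.out.println(previousRow(rows,
            "00003db1b6c1e7e7d2ece41ff2184f76", 1362385913000L, 900000L));
    }
}
```

The sample keys are about 14 minutes apart, so the sketch uses a 15 minute window. If the window is too small the method returns null and you would retry with a larger one.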

Else, if you don't know, you will have to start with
scan.setStartRow(Bytes.toBytes(getMD5AsHex(Bytes.toBytes(myObjectId)))); but you
might have to skip MANY lines before finding the right one. So I don't
really recommend that.
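To see why this second option can get expensive, here is a small plain-Java sketch (hypothetical names, TreeSet standing in for the table) that counts how many rows are read from the bare hash prefix before the current row shows up. The count grows with the number of newer versions stored for the object.

```java
import java.util.TreeSet;

// Sketch only: a TreeSet simulates the sorted row keys of the table.
class PrefixScanSketch {
    // Counts the rows of this object that a scan starting at the bare
    // hash prefix reads before it reaches currentKey.
    static int rowsSkipped(TreeSet<String> rows, String hash, String currentKey) {
        int skipped = 0;
        for (String row : rows.tailSet(hash, true)) {
            if (!row.startsWith(hash) || row.compareTo(currentKey) >= 0) {
                break; // left the object, or reached the current row
            }
            skipped++;
        }
        return skipped;
    }
}
```

The last row counted is the "previous" one, so every row before it was read for nothing.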


2013/4/29 Shahab Yunus <[EMAIL PROTECTED]>:
> I think you cannot simply use the scanner to do a range scan here, as your
> keys are not monotonically increasing. You need to apply logic to
> decode/reverse the mechanism that you used to hash your keys at the
> time of writing. You might want to check out the Sematext library (HBaseWD,
> linked below), which does distributed scans and seems to handle the
> scenarios that you want to implement.
> http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
> On Mon, Apr 29, 2013 at 11:03 AM, <[EMAIL PROTECTED]> wrote:
>> Hi,
>> I have a rowkey defined by :
>>         getMD5AsHex(Bytes.toBytes(myObjectId)) + String.format("%19d\n",
>> (Long.MAX_VALUE - changeDate.getTime()));
>> How could I get the previous and next row for a given rowkey ?
>> For instance, I have the following ordered keys :
>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370673172227807
>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468022807
>> >00003db1b6c1e7e7d2ece41ff2184f76*9223370674468862807
>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674984237807
>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674987271807
>> If I choose the rowkey :
>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468862807, what would be the
>> correct scan to get the previous and next key ?
>> Result would be :
>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468022807
>> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674984237807
>> Thank you !
>> R.