HBase >> mail # user >> Read access pattern


+
ricla@... 2013-04-29, 15:03
+
Shahab Yunus 2013-04-29, 15:17
+
Jean-Marc Spaggiari 2013-04-29, 16:17
+
ricla@... 2013-04-29, 17:05
Re: Read access pattern
HBASE-4811 is what you should be looking for, but it's not even close
to being implemented yet...

One option would be to have 2 tables, each sorted in the reverse order
of the other. Scanning forward in each one gives you the key just after
your current key in that table's order, so between the two tables you
get both the key before and the key after...
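
A minimal sketch of the two-table idea, using two in-memory sorted sets in place of HBase tables (an assumption, so the example runs without a cluster). The class and method names are hypothetical. In HBase itself the second table would need a key encoding that sorts in the opposite order, since rows are always stored in ascending byte order; the reversed comparator below only stands in for that.

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.TreeSet;

// Hypothetical stand-in for two HBase tables holding the same row keys:
// one sorted ascending, one sorted descending.
class TwoTableNeighbors {
    private final TreeSet<String> ascending = new TreeSet<>();
    private final TreeSet<String> descending = new TreeSet<>(Comparator.reverseOrder());

    // Every write goes to both "tables".
    void put(String rowKey) {
        ascending.add(rowKey);
        descending.add(rowKey);
    }

    // Forward-only scan starting at rowKey: skip rowKey itself, return the row after it.
    private static String firstAfter(TreeSet<String> table, String rowKey) {
        Iterator<String> scan = table.tailSet(rowKey, true).iterator();
        while (scan.hasNext()) {
            String key = scan.next();
            if (!key.equals(rowKey)) {
                return key;
            }
        }
        return null; // no neighbor on this side
    }

    String next(String rowKey) {
        return firstAfter(ascending, rowKey);
    }

    String previous(String rowKey) {
        return firstAfter(descending, rowKey);
    }

    public static void main(String[] args) {
        TwoTableNeighbors tables = new TwoTableNeighbors();
        String prefix = "00003db1b6c1e7e7d2ece41ff2184f76*";
        tables.put(prefix + "9223370673172227807");
        tables.put(prefix + "9223370674468022807");
        tables.put(prefix + "9223370674468862807");
        tables.put(prefix + "9223370674984237807");
        tables.put(prefix + "9223370674987271807");

        String current = prefix + "9223370674468862807";
        System.out.println("previous: " + tables.previous(current));
        System.out.println("next:     " + tables.next(current));
    }
}
```

The cost of this approach is that every write has to go to both tables, and the MD5 prefix of the returned key still has to be checked to make sure it belongs to the same object.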

2013/4/29  <[EMAIL PROTECTED]>:
>
> Thanks for the quick answer.
>
>> For the next key, I think you can simply use your current key as your
>> scanner first key. You will then find the one which is just after.
>> Then you will have to verify the MD5 hash to make sure it's still for
>> the same object.
> Right, this is basically easy.
>
>> First, if you know that you are storing data about every 10 seconds,
>> set the startRow with something like
>> getMD5AsHex(Bytes.toBytes(myObjectId)) + String.format("%19d\n",
>> (Long.MAX_VALUE - (changeDate.getTime() - 60000))) then read the few
>> lines you will have until you find your current line, and keep the
>> last one.
>
> Actually, it is impossible to know the time range in which there will be a next entry.
>
>>
>> Else, if you don't know, you will have to start with
>> scan.setStartRow(getMD5AsHex(Bytes.toBytes(myObjectId))); but you
>> might have to skip MANY lines before finding the right one. So I don't
>> really recommend that.
>
> Ouch, obviously not very efficient. Even with a filter, I assume?
>> Message of 29/04/13 18:18
>> From: "Jean-Marc Spaggiari"
>> To: [EMAIL PROTECTED]
>> Cc:
>> Subject: Re: Read access pattern
>>
>> Hum.
>>
>> For the next key, I think you can simply use your current key as your
>> scanner first key. You will then find the one which is just after.
>> Then you will have to verify the MD5 hash to make sure it's still for
>> the same object.
>>
>> scan.setStartRow(getMD5AsHex(Bytes.toBytes(myObjectId)) +
>> String.format("%19d\n", (Long.MAX_VALUE - changeDate.getTime())));
>>
>> If you want to find the one just before, quickly, I see 2 options.
>>
>> First, if you know that you are storing data about every 10 seconds,
>> set the startRow with something like
>> getMD5AsHex(Bytes.toBytes(myObjectId)) + String.format("%19d\n",
>> (Long.MAX_VALUE - (changeDate.getTime() - 60000))) then read the few
>> lines you will have until you find your current line, and keep the
>> last one.
>>
>> Else, if you don't know, you will have to start with
>> scan.setStartRow(getMD5AsHex(Bytes.toBytes(myObjectId))); but you
>> might have to skip MANY lines before finding the right one. So I don't
>> really recommend that.
>>
>> JM
>>
>> 2013/4/29 Shahab Yunus :
>> > I think you cannot simply use the scanner to do a range scan here, as your
>> > keys are not monotonically increasing. You need to apply logic to
>> > decode/reverse the mechanism that you used to hash your keys at the
>> > time of writing. You might want to check out the SemaText library, which
>> > does distributed scans and seems to handle the scenarios that you want to
>> > implement.
>> > http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
>> >
>> >
>> > On Mon, Apr 29, 2013 at 11:03 AM, wrote:
>> >
>> >> Hi,
>> >>
>> >> I have a rowkey defined by :
>> >> getMD5AsHex(Bytes.toBytes(myObjectId)) + String.format("%19d\n",
>> >> (Long.MAX_VALUE - changeDate.getTime()));
>> >>
>> >> How could I get the previous and next row for a given rowkey ?
>> >> For instance, I have the following ordered keys :
>> >>
>> >> 00003db1b6c1e7e7d2ece41ff2184f76*9223370673172227807
>> >> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468022807
>> >> >00003db1b6c1e7e7d2ece41ff2184f76*9223370674468862807
>> >> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674984237807
>> >> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674987271807
>> >>
>> >> If I choose the rowkey :
>> >> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468862807, what would be the
>> >> correct scan to get the previous and next key ?
>> >> Result would be :
>> >> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674468022807
>> >> 00003db1b6c1e7e7d2ece41ff2184f76*9223370674984237807
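
The key layout and the two lookups discussed in this thread can be sketched as below. This is only an illustration under several assumptions: a sorted in-memory set stands in for the HBase table, `md5Hex` is a hypothetical replacement for the `getMD5AsHex` helper, the object id is taken to be a string, and the `*` separator and trailing `\n` seen in the messages are left out. The sketch adds the look-back window to the timestamp because, with reversed timestamps, a later change sorts earlier; that is one reading of the key layout, not something the thread states.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.NavigableSet;
import java.util.TreeSet;

class RowKeyNeighbors {
    // Hypothetical stand-in for the getMD5AsHex helper used in the thread.
    static String md5Hex(String objectId) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : md5.digest(objectId.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Row key = 32-char MD5 prefix + reversed timestamp, so newer changes sort first.
    static String rowKey(String objectId, long changeTimeMillis) {
        return md5Hex(objectId) + String.format("%19d", Long.MAX_VALUE - changeTimeMillis);
    }

    // Models a forward scan that starts at `current` and skips it: take the first
    // key after it, then check the MD5 prefix to be sure it is the same object.
    static String next(NavigableSet<String> table, String current) {
        String candidate = table.higher(current);
        boolean sameObject = candidate != null && candidate.startsWith(current.substring(0, 32));
        return sameObject ? candidate : null;
    }

    // Bounded look-back: start the scan at a row that sorts shortly before the
    // current one, read forward up to (not including) the current row, keep the last.
    static String previousWithin(NavigableSet<String> table, String objectId,
                                 long changeTimeMillis, long windowMillis) {
        String current = rowKey(objectId, changeTimeMillis);
        // A later timestamp gives a smaller reversed value, hence an earlier row.
        String startRow = rowKey(objectId, changeTimeMillis + windowMillis);
        String last = null;
        for (String key : table.subSet(startRow, true, current, false)) {
            last = key;
        }
        return last; // null if nothing falls inside the window
    }

    public static void main(String[] args) {
        NavigableSet<String> table = new TreeSet<>();
        long t = 1_367_247_600_000L; // arbitrary example timestamp
        for (long offset : new long[] {-20_000, -10_000, 0, 10_000, 20_000}) {
            table.add(rowKey("my-object", t + offset));
        }
        String current = rowKey("my-object", t);
        System.out.println("current:  " + current);
        System.out.println("next:     " + next(table, current));
        System.out.println("previous: " + previousWithin(table, "my-object", t, 60_000));
    }
}
```

If the window is too small, `previousWithin` returns nothing, which is the problem raised above: when the gap between two entries is unknown, no fixed window is guaranteed to contain the previous row.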