HBase, mail # user - Put w/ timestamp -> Deleteall -> Put w/ timestamp fails


Re: Put w/ timestamp -> Deleteall -> Put w/ timestamp fails
yonghu 2012-08-15, 11:48
Hi Harsh,

I have a question about your description. The delete marker masks a newly
inserted value that carries an older timestamp, which is why the newly
inserted data can't be seen. But after a major compaction, this new value
can be seen again. So my question is how the deletion is really executed.
In my understanding, a deletion removes all data values whose timestamps
are less than or equal to the timestamp of the delete marker. So, if you
insert a value with an old timestamp after you insert a delete marker, it
should also be deleted at compaction time. For example, suppose I first
insert (k1,t1), then delete (k1,t1) with a delete marker whose timestamp
is greater than t1, and then reinsert (k1,t1). At compaction time, both
(k1,t1) entries should be deleted.
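
The scenario can be sketched as a toy Python model. This is not HBase code:
`ToyStore`, its method names and the marker timestamp 1000 (a stand-in for
"current time") are made up for illustration. It assumes that a delete marker
masks every cell whose timestamp is less than or equal to its own, and that a
major compaction drops both the marker and the cells it masks.

```python
class ToyStore:
    """Toy model of a single row/column: versioned cells plus delete markers."""

    def __init__(self):
        self.cells = {}        # timestamp -> value
        self.tombstones = []   # timestamps of delete markers

    def put(self, ts, value):
        self.cells[ts] = value

    def delete(self, ts):
        # A delete only writes a marker; nothing is physically removed yet.
        self.tombstones.append(ts)

    def _masked(self, ts):
        # Assumption: a marker hides every cell with timestamp <= its own.
        return any(ts <= marker for marker in self.tombstones)

    def scan(self):
        # Return the newest visible (timestamp, value), or None if all are masked.
        visible = [ts for ts in self.cells if not self._masked(ts)]
        if not visible:
            return None
        newest = max(visible)
        return newest, self.cells[newest]

    def major_compact(self):
        # Drop masked cells together with the markers that masked them.
        self.cells = {ts: v for ts, v in self.cells.items()
                      if not self._masked(ts)}
        self.tombstones = []


store = ToyStore()
store.put(10, "value")     # first insert of (k1, t1)
store.delete(1000)         # marker with a timestamp greater than t1
store.put(10, "value")     # reinsert (k1, t1) before any compaction
print(store.scan())        # None: the marker masks the reinserted cell

store.major_compact()
print(store.scan())        # None: the masked cell went away with the marker

store.put(10, "value")     # insert again, now that the marker is gone
print(store.scan())        # (10, 'value')
```

Under these assumptions, a put with an old timestamp becomes visible only if
it is issued after the compaction has removed the marker; a put issued before
the compaction is purged along with it.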

I look forward to your response!

Yong

On Wed, Aug 15, 2012 at 7:53 AM, Takahiko Kawasaki <[EMAIL PROTECTED]> wrote:
> Dear Harsh,
>
> Thank you very much for your detailed explanation. I could understand
> what had been going on during my put/scan/delete operations. I'll modify
> my application and test programs taking the timestamp implementation
> into consideration.
>
> Best Regards,
> Takahiko Kawasaki
>
> 2012/8/15 Harsh J <[EMAIL PROTECTED]>
>
>> When a Delete occurs, a delete marker is inserted with the current
>> time as its timestamp (to indicate it is the latest version). Hence,
>> when you insert a value after this with an _older_ timestamp, it is not
>> taken in as the latest version, and is hence ignored when scanning.
>> This is why you do not see the data.
>>
>> If you instead insert this after a compaction has fully run on this
>> store file, then your value will indeed be shown after the insert,
>> because at that moment no entry with a later timestamp would exist
>> for that row at all.
>>
>> hbase(main):060:0> flush 'test-table'
>> 0 row(s) in 0.1020 seconds
>>
>> hbase(main):061:0> major_compact 'test-table'
>> 0 row(s) in 0.0400 seconds
>>
>> hbase(main):062:0> put 'test-table', 'row4', 'test-family', 'value', 10
>> 0 row(s) in 0.0230 seconds
>>
>> hbase(main):063:0> scan 'test-table'
>> ROW                   COLUMN+CELL
>>  row4                 column=test-family:, timestamp=10, value=value
>> 1 row(s) in 0.0060 seconds
>>
>> I suppose this is why it is recommended not to mess with the
>> timestamps manually, and instead just rely on versions.
>>
>> On Tue, Aug 14, 2012 at 8:24 PM, Takahiko Kawasaki <[EMAIL PROTECTED]>
>> wrote:
>> > Hello,
>> >
>> > I have a problem where 'put' with timestamp does not succeed.
>> > I did the following at the HBase shell.
>> >
>> > (1) Do 'put' with timestamp.
>> >       # 'scan' shows 1 row.
>> >
>> > (2) Delete the row by 'deleteall'.
>> >       # 'scan' says "0 row(s)".
>> >
>> > (3) Do 'put' again by the same command line as (1).
>> >       # 'scan' says "0 row(s)" ! Why?
>> >
>> > (4) Increment the timestamp value by 1 and try 'put' again.
>> >       # 'scan' still says "0 row(s)"! Why?
>> >
>> > The command lines I actually typed are as follows and the attached
>> > file is the output from the command lines.
>> >
>> > scan 'test-table'
>> > put 'test-table', 'row3', 'test-family', 'value'
>> > scan 'test-table'
>> > deleteall 'test-table', 'row3'
>> > scan 'test-table'
>> > put 'test-table', 'row3', 'test-family', 'value'
>> > scan 'test-table'
>> > deleteall 'test-table', 'row3'
>> > scan 'test-table'
>> > put 'test-table', 'row4', 'test-family', 'value', 10
>> > scan 'test-table'
>> > deleteall 'test-table', 'row4'
>> > scan 'test-table'
>> > put 'test-table', 'row4', 'test-family', 'value', 10
>> > scan 'test-table'
>> > put 'test-table', 'row4', 'test-family', 'value', 10
>> > scan 'test-table'
>> > quit
>> >
>> > Is this the specified behavior of HBase?
>> >
>> > My cluster is built using CDH4 and the HBase version is 0.92.1-cdh4.0.0.
>> >
>> > Could anyone give me any insight, please?
>> >
>> > Best Regards,
>> > Takahiko Kawasaki
>>
>>
>>
>> --
>> Harsh J
>>