Re: How to config hbase0.94.2 to retain deleted data
Lars,

As with secondary indexes, doing remote updates to other region servers isn't necessarily a bad thing.

There are ways to mitigate some of the cost of the update to the second table; for one, the actual update doesn't have to be synchronous.
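
As a rough, untested sketch of that idea against the 0.94 coprocessor API (the history table name, the 'meta:deleted_at' column and the single background thread are all made up for illustration): trap Put and Delete in a RegionObserver and mirror them into a separate history table off the handler thread. As Lars notes below, a complete solution would also have to cover Increment, Append, RowMutations, etc.

import java.io.IOException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.apache.hadoop.hbase.CoprocessorEnvironment;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.wal.WALEdit;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical observer: mirrors every Put into a 'usertable_history' table and
// records delete events there as data instead of applying them, so the history
// survives major compaction of the main table.
public class HistoryMirrorObserver extends BaseRegionObserver {

  private static final byte[] HISTORY_TABLE = Bytes.toBytes("usertable_history");
  // Single background thread so the mirror write stays off the client's critical path.
  // (Executor shutdown in stop() is omitted in this sketch.)
  private final ExecutorService pool = Executors.newSingleThreadExecutor();

  @Override
  public void prePut(ObserverContext<RegionCoprocessorEnvironment> e,
      final Put put, WALEdit edit, boolean writeToWAL) throws IOException {
    final CoprocessorEnvironment env = e.getEnvironment();
    pool.submit(new Runnable() {
      public void run() {
        try {
          HTableInterface history = env.getTable(HISTORY_TABLE);
          try {
            history.put(put);   // same row key, same cells, same timestamps
          } finally {
            history.close();
          }
        } catch (IOException ex) {
          // A real deployment needs retry / dead-letter handling here.
        }
      }
    });
  }

  @Override
  public void preDelete(ObserverContext<RegionCoprocessorEnvironment> e,
      final Delete delete, WALEdit edit, boolean writeToWAL) throws IOException {
    final CoprocessorEnvironment env = e.getEnvironment();
    pool.submit(new Runnable() {
      public void run() {
        try {
          HTableInterface history = env.getTable(HISTORY_TABLE);
          try {
            // Record the delete as a marker cell instead of deleting from history.
            Put marker = new Put(delete.getRow());
            marker.add(Bytes.toBytes("meta"), Bytes.toBytes("deleted_at"),
                Bytes.toBytes(delete.getTimeStamp()));
            history.put(marker);
          } finally {
            history.close();
          }
        } catch (IOException ex) {
          // Retry / dead-letter handling omitted in this sketch.
        }
      }
    });
  }
}

The observer would be loaded the usual way (table coprocessor attribute or hbase.coprocessor.region.classes), the history table has to exist up front, and error handling is deliberately minimal here.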

HTH

-Mike

On Oct 21, 2012, at 7:23 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:

> That'd work too. Requires the regionservers to make remote updates to other regionservers, though. And you have to trap each and every change (Put, Delete, Increment, Append, RowMutations, etc.).
>
>
> Curious, why do you think this is better than using the keep-deleted-cells feature?
> (It might well be, just curious)
>
>
> -- Lars
>
>
>
> ----- Original Message -----
> From: Michael Segel <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Cc:
> Sent: Sunday, October 21, 2012 4:34 PM
> Subject: Re: How to config hbase0.94.2 to retain deleted data
>
> I would suggest that you use your coprocessor to copy the data to a 'backup' table when you mark the rows for delete.
> Then, as major compaction hits, the rows are deleted from the main table but still reside, undeleted, in your backup table.
> Call it a history table.
>
>
> On Oct 21, 2012, at 3:53 PM, yun peng <[EMAIL PROTECTED]> wrote:
>
>> Hi, All,
>> I want to retain all deleted key-value pairs in hbase. I have tried to
>> config HColumnDescript as follow to make it return deleted.
>>
>>   public void postOpen(ObserverContext<RegionCoprocessorEnvironment> e) {
>>     HTableDescriptor htd = e.getEnvironment().getRegion().getTableDesc();
>>     HColumnDescriptor hcd = htd.getFamily(Bytes.toBytes("cf"));
>>     hcd.setKeepDeletedCells(true);
>>     hcd.setBlockCacheEnabled(false);
>>   }
>>
>> However, it does not work for me: when I issue a delete and then query
>> by an older timestamp, the old data does not show up.
>>
>> hbase(main):119:0> put 'usertable', "key1", 'cf:c1', "v1", 99
>> hbase(main):120:0> put 'usertable', "key1", 'cf:c1', "v2", 101
>> hbase(main):121:0> delete 'usertable', "key1", 'cf:c1', 100
>> hbase(main):122:0> get 'usertable', 'key1', {COLUMN => 'cf:c1', TIMESTAMP
>> => 99, VERSIONS => 4}
>> COLUMN                CELL
>>
>> 0 row(s) in 0.0040 seconds
>>
>> hbase(main):123:0> get 'usertable', 'key1', {COLUMN => 'cf:c1', TIMESTAMP
>> => 100, VERSIONS => 4}
>> COLUMN                CELL
>>
>> 0 row(s) in 0.0050 seconds
>>
>> hbase(main):124:0> get 'usertable', 'key1', {COLUMN => 'cf:c1', TIMESTAMP
>> => 101, VERSIONS => 4}
>> COLUMN                CELL
>>
>> cf:c1                timestamp=101, value=v2
>>
>> 1 row(s) in 0.0050 seconds
>>
>> Note this is a new feature in 0.94.2
>> (HBASE-4536 <https://issues.apache.org/jira/browse/HBASE-4536>).
>> I did not find much sample code online, so... does anyone here have
>> experience using HBASE-4536? How should one configure HBase to enable
>> this feature?
>>
>> Thanks
>> Yun
>
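
For completeness, since the original question was how to turn HBASE-4536 on: the postOpen approach quoted above most likely has no effect because it only mutates an in-memory copy of the column descriptor; KEEP_DELETED_CELLS has to be part of the table schema itself. A sketch of doing that through HBaseAdmin (table and family names taken from the example above; the version count is just an illustration):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

// Sketch: make KEEP_DELETED_CELLS part of the table schema itself.
public class EnableKeepDeletedCells {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    try {
      byte[] table = Bytes.toBytes("usertable");

      // Take the current definition of 'cf', turn on KEEP_DELETED_CELLS,
      // and keep enough versions around for time-travel gets.
      HColumnDescriptor cf = admin.getTableDescriptor(table)
          .getFamily(Bytes.toBytes("cf"));
      cf.setKeepDeletedCells(true);
      cf.setMaxVersions(10);   // example value; pick what the application needs

      admin.disableTable(table);
      admin.modifyColumn(table, cf);
      admin.enableTable(table);
    } finally {
      admin.close();
    }
  }
}

The shell equivalent should be roughly: disable 'usertable'; alter 'usertable', {NAME => 'cf', KEEP_DELETED_CELLS => true}; enable 'usertable'. With that in place, a get or scan whose time range ends before the delete (like the TIMESTAMP => 99 get above) should see the deleted cells again, subject to VERSIONS and TTL, and a raw scan (RAW => true) will also show the delete markers.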