HBase, mail # user - How to config hbase0.94.2 to retain deleted data


Re: How to config hbase0.94.2 to retain deleted data
Michael Segel 2012-10-23, 04:18
>
> Curious, why do you think this is better than using the keep-deleted-cells feature?
> (It might well be, just curious)

Ok... so what exactly does this feature mean?

Suppose I have 500 rows within a region. I set this feature to be true.
I do a massive delete and there are only 50 rows left standing.

So if I do a count of the number of rows in the region, I see only 50, yet if I compact the table, it's still full.

Granted, I'm talking about rows and not cells, but the idea is the same. IMHO you're asking for more headaches than you solve.
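
[Editor's sketch] The 500-row scenario above can be illustrated with a toy model in plain Java. This is not HBase code: `ToyRegion`, `visibleRowCount`, `storedRowCount`, and `majorCompact` are hypothetical names made up for this illustration, timestamps and versions are ignored, and the only assumption modeled is that a keep-deleted-cells setting lets deleted data survive a major compaction.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of one region. Hypothetical, for illustration only; not HBase code. */
final class ToyRegion {
    private final boolean keepDeletedCells;                 // assumed per-family setting
    private final Set<String> storedRows = new TreeSet<>(); // rows physically stored
    private final Set<String> deleteMarkers = new HashSet<>();

    ToyRegion(boolean keepDeletedCells) {
        this.keepDeletedCells = keepDeletedCells;
    }

    void put(String row) {
        storedRows.add(row);
    }

    /** A delete only writes a marker; the data itself stays until compaction. */
    void delete(String row) {
        deleteMarkers.add(row);
    }

    /** What a normal row count sees: stored rows not hidden by a marker. */
    int visibleRowCount() {
        int count = 0;
        for (String row : storedRows) {
            if (!deleteMarkers.contains(row)) {
                count++;
            }
        }
        return count;
    }

    /** What still occupies storage. */
    int storedRowCount() {
        return storedRows.size();
    }

    /** Major compaction drops deleted data unless the keep flag is set. */
    void majorCompact() {
        if (!keepDeletedCells) {
            storedRows.removeAll(deleteMarkers);
            deleteMarkers.clear();
        }
    }

    public static void main(String[] args) {
        for (boolean keep : new boolean[] {true, false}) {
            ToyRegion region = new ToyRegion(keep);
            for (int i = 0; i < 500; i++) {
                region.put("row" + i);
            }
            for (int i = 0; i < 450; i++) {
                region.delete("row" + i);
            }
            region.majorCompact();
            System.out.println("keep=" + keep
                    + " visible=" + region.visibleRowCount()
                    + " stored=" + region.storedRowCount());
        }
    }
}
```

In this model the visible count is 50 either way; only the stored count differs, which is the mismatch described above.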

KISS would suggest that moving deleted data into a different table would yield better performance in the long run.
On Oct 21, 2012, at 7:23 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:

> That'd work too. It requires the regionservers to make remote updates to other regionservers, though. And you have to trap each and every change (Put, Delete, Increment, Append, RowMutations, etc.).
>
>
> Curious, why do you think this is better than using the keep-deleted-cells feature?
> (It might well be, just curious)
>
>
> -- Lars
>
>
>
> ----- Original Message -----
> From: Michael Segel <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Cc:
> Sent: Sunday, October 21, 2012 4:34 PM
> Subject: Re: How to config hbase0.94.2 to retain deleted data
>
> I would suggest that you use your coprocessor to copy the data to a 'backup' table when you mark it for deletion.
> Then as major compaction hits, the rows are deleted from the main table, but still reside undeleted in your delete table.
> Call it a history table.
>
>
> On Oct 21, 2012, at 3:53 PM, yun peng <[EMAIL PROTECTED]> wrote:
>
>> Hi, All,
>> I want to retain all deleted key-value pairs in HBase. I have tried to
>> configure the HColumnDescriptor as follows to make it keep deleted cells.
>>
>>   public void postOpen(ObserverContext<RegionCoprocessorEnvironment> e) {
>>     HTableDescriptor htd = e.getEnvironment().getRegion().getTableDesc();
>>     HColumnDescriptor hcd = htd.getFamily(Bytes.toBytes("cf"));
>>     hcd.setKeepDeletedCells(true);
>>     hcd.setBlockCacheEnabled(false);
>>   }
>>
>> However, it does not work for me: when I issue a delete and then query
>> by an older timestamp, the old data does not show up.
>>
>> hbase(main):119:0> put 'usertable', "key1", 'cf:c1', "v1", 99
>> hbase(main):120:0> put 'usertable', "key1", 'cf:c1', "v2", 101
>> hbase(main):121:0> delete 'usertable', "key1", 'cf:c1', 100
>> hbase(main):122:0> get 'usertable', 'key1', {COLUMN => 'cf:c1', TIMESTAMP
>> => 99, VERSIONS => 4}
>> COLUMN                CELL
>>
>> 0 row(s) in 0.0040 seconds
>>
>> hbase(main):123:0> get 'usertable', 'key1', {COLUMN => 'cf:c1', TIMESTAMP
>> => 100, VERSIONS => 4}
>> COLUMN                CELL
>>
>> 0 row(s) in 0.0050 seconds
>>
>> hbase(main):124:0> get 'usertable', 'key1', {COLUMN => 'cf:c1', TIMESTAMP
>> => 101, VERSIONS => 4}
>> COLUMN                CELL
>>
>> cf:c1                timestamp=101, value=v2
>>
>> 1 row(s) in 0.0050 seconds
>>
>> Note this is a new feature in 0.94.2
>> (HBASE-4536<https://issues.apache.org/jira/browse/HBASE-4536>).
>> I did not find much sample code online. Does anyone here have
>> experience using HBASE-4536? How should one configure
>> HBase to enable this feature?
>>
>> Thanks
>> Yun
>
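
[Editor's sketch] The shell session in the original question can be replayed against a small plain-Java model to show what a point-in-time read would be expected to return once keep-deleted-cells is really in effect on the column family (which the `postOpen` change in the question may not achieve). This is a toy model, not HBase code: `ToyColumn`, `getAt`, and `isMasked` are hypothetical names. It assumes that the shell `delete` at timestamp 100 masks every version at or below 100 (consistent with the output shown), and that with the flag on, a delete marker outside the query's time range is ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Toy model of one column's versions. Hypothetical, for illustration only. */
final class ToyColumn {
    private static final class Cell {
        final long ts;
        final String value;

        Cell(long ts, String value) {
            this.ts = ts;
            this.value = value;
        }
    }

    private final boolean keepDeletedCells;            // assumed per-family setting
    private final List<Cell> cells = new ArrayList<>();
    private final List<Long> deleteMarkers = new ArrayList<>();

    ToyColumn(boolean keepDeletedCells) {
        this.keepDeletedCells = keepDeletedCells;
    }

    void put(String value, long ts) {
        cells.add(new Cell(ts, value));
    }

    /** Assumption: the marker masks all versions with timestamp <= ts. */
    void delete(long ts) {
        deleteMarkers.add(ts);
    }

    /** Reads exactly the version at ts, i.e. the time range [ts, ts + 1). */
    Optional<String> getAt(long ts) {
        for (Cell cell : cells) {
            if (cell.ts == ts) {
                return isMasked(cell.ts, ts + 1)
                        ? Optional.empty()
                        : Optional.of(cell.value);
            }
        }
        return Optional.empty();
    }

    private boolean isMasked(long cellTs, long rangeEndExclusive) {
        for (long marker : deleteMarkers) {
            if (marker < cellTs) {
                continue; // a marker older than the cell does not cover it
            }
            if (keepDeletedCells && marker >= rangeEndExclusive) {
                continue; // flag on: markers outside the time range are ignored
            }
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        for (boolean keep : new boolean[] {false, true}) {
            ToyColumn column = new ToyColumn(keep);
            column.put("v1", 99);
            column.put("v2", 101);
            column.delete(100);
            System.out.println("keep=" + keep
                    + " ts99=" + column.getAt(99)
                    + " ts100=" + column.getAt(100)
                    + " ts101=" + column.getAt(101));
        }
    }
}
```

With the flag off, the model reproduces the session above (nothing at 99 or 100, `v2` at 101). With the flag on, the read at timestamp 99 returns `v1`, because the marker at 100 lies outside the range [99, 100).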