HBase >> mail # user >> How to config hbase0.94.2 to retain deleted data


Re: How to config hbase0.94.2 to retain deleted data
I'm not sure you can change the table or column descriptors this way through a coprocessor.
Did you try creating (or altering) the table with KEEP_DELETED_CELLS set to true? For example:

hbase(main):026:0> create 'usertable', {NAME=>'cf', KEEP_DELETED_CELLS=>true}
0 row(s) in 1.1660 seconds

hbase(main):027:0> put 'usertable', "key1", 'cf:c1', "v1", 99
0 row(s) in 0.0320 seconds

hbase(main):028:0> delete 'usertable', "key1", 'cf:c1', 100
0 row(s) in 0.0050 seconds

hbase(main):029:0> get 'usertable', 'key1', {COLUMN => 'cf:c1', TIMESTAMP=> 99, VERSIONS => 4}
COLUMN                CELL                                                    
 cf:c1                timestamp=99, value=v1                                  
1 row(s) in 0.0150 seconds

Let me know how this works for you (generally). This is a new feature I added to 0.94 to support true time-range queries.
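
To make the time-range behaviour concrete, here is a toy in-memory model of a single column. It is only a sketch under stated assumptions, not HBase's actual code: the class and method names are hypothetical, a delete marker is assumed to mask every version at or below its timestamp (a column delete, as the shell sessions in this thread suggest), a get at timestamp T is modelled as the half-open range [T, T+1), and compactions are ignored entirely. The one rule it tries to illustrate is that, with KEEP_DELETED_CELLS on, a delete marker lying at or beyond the upper bound of the queried range is not applied.

```java
import java.util.ArrayList;
import java.util.List;

// Toy in-memory model of a single column. NOT HBase code; all names are hypothetical.
final class KeepDeletedCellsSketch {

    // One stored version of the column.
    static final class Version {
        final long ts;
        final String value;
        Version(long ts, String value) { this.ts = ts; this.value = value; }
    }

    private final boolean keepDeletedCells;
    private final List<Version> versions = new ArrayList<>();
    private final List<Long> deleteMarkers = new ArrayList<>();

    KeepDeletedCellsSketch(boolean keepDeletedCells) {
        this.keepDeletedCells = keepDeletedCells;
    }

    void put(long ts, String value) { versions.add(new Version(ts, value)); }

    // Assumption: a column delete marker masks every version with timestamp <= ts.
    void delete(long ts) { deleteMarkers.add(ts); }

    // Values visible in the half-open time range [minTs, maxTs), newest first.
    List<String> get(long minTs, long maxTs) {
        List<Version> visible = new ArrayList<>();
        for (Version v : versions) {
            if (v.ts < minTs || v.ts >= maxTs) continue;   // outside the queried range
            if (!isMasked(v, maxTs)) visible.add(v);
        }
        visible.sort((a, b) -> Long.compare(b.ts, a.ts));
        List<String> values = new ArrayList<>();
        for (Version v : visible) values.add(v.value);
        return values;
    }

    private boolean isMasked(Version v, long maxTs) {
        for (long marker : deleteMarkers) {
            // Assumption: with keepDeletedCells, a marker at or beyond the
            // upper bound of the range is ignored; without it, every marker applies.
            boolean markerSeen = !keepDeletedCells || marker < maxTs;
            if (markerSeen && v.ts <= marker) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        for (boolean keep : new boolean[] {false, true}) {
            KeepDeletedCellsSketch col = new KeepDeletedCellsSketch(keep);
            col.put(99, "v1");
            col.put(101, "v2");
            col.delete(100);
            System.out.println("keep=" + keep
                + " ts=99 -> " + col.get(99, 100)
                + ", ts=101 -> " + col.get(101, 102));
        }
    }
}
```

In this model, the put at 99 followed by a delete at 100 is hidden from a get at timestamp 99 when the flag is off and visible when it is on, while a query whose range includes the marker still hides it in both cases.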

-- Lars
----- Original Message -----
From: yun peng <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc:
Sent: Sunday, October 21, 2012 1:53 PM
Subject: How to config hbase0.94.2 to retain deleted data

Hi, All,
I want to retain all deleted key-value pairs in HBase. I have tried to
configure the HColumnDescriptor as follows so that deleted cells are returned:

  public void postOpen(ObserverContext<RegionCoprocessorEnvironment> e) {
    HTableDescriptor htd = e.getEnvironment().getRegion().getTableDesc();
    HColumnDescriptor hcd = htd.getFamily(Bytes.toBytes("cf"));
    hcd.setKeepDeletedCells(true);
    hcd.setBlockCacheEnabled(false);
  }

However, it does not work for me: when I issue a delete and then query
by an older timestamp, the old data does not show up.

hbase(main):119:0> put 'usertable', "key1", 'cf:c1', "v1", 99
hbase(main):120:0> put 'usertable', "key1", 'cf:c1', "v2", 101
hbase(main):121:0> delete 'usertable', "key1", 'cf:c1', 100
hbase(main):122:0> get 'usertable', 'key1', {COLUMN => 'cf:c1', TIMESTAMP => 99, VERSIONS => 4}
COLUMN                CELL
0 row(s) in 0.0040 seconds

hbase(main):123:0> get 'usertable', 'key1', {COLUMN => 'cf:c1', TIMESTAMP => 100, VERSIONS => 4}
COLUMN                CELL
0 row(s) in 0.0050 seconds

hbase(main):124:0> get 'usertable', 'key1', {COLUMN => 'cf:c1', TIMESTAMP => 101, VERSIONS => 4}
COLUMN                CELL
 cf:c1                timestamp=101, value=v2
1 row(s) in 0.0050 seconds

Note that this is a new feature in 0.94.2
(HBASE-4536 <https://issues.apache.org/jira/browse/HBASE-4536>).
I did not find much sample code online, so does anyone here have
experience using HBASE-4536? How should one configure HBase to
enable this feature?

Thanks
Yun