

Yi Liang 2011-11-24, 07:38
Re: delete operation with timestamp

On Nov 24, 2011, at 08:38, Yi Liang wrote:

> We're using hbase-0.90.3 with the Thrift client, and have encountered some
> problems when we want to delete one specific version of a cell.
>
> First, there's no corresponding Thrift API for Delete#deleteColumn(byte[]
> family, byte[] qualifier, long timestamp). Instead, mutateRowTs only
> supports deleteColumns. But what we want is deleteColumn, as we need to
> keep the older versions. IMO, mutateRowTs should be implemented
> with deleteColumn rather than deleteColumns. The hbase shell's delete
> command has the same problem.
>
> Second, we find we can't reinsert any older cell if we have deleted that
> cell with deleteColumns. For example:
> hbase(main):007:0> scan 'test3'
> ROW                                           COLUMN+CELL
> 0 row(s) in 0.0110 seconds
>
> hbase(main):008:0> put 'test3', 'r1', 'f1:c1', 'old', 1315550678308
> 0 row(s) in 0.0100 seconds
>
> hbase(main):009:0> scan 'test3'
> ROW                                           COLUMN+CELL
> r1                                           column=f1:c1, timestamp=1315550678308, value=old
> 1 row(s) in 0.0290 seconds
>
> hbase(main):012:0> put 'test3', 'r1', 'f1:c1', 'new'
> 0 row(s) in 0.0090 seconds
>
> hbase(main):013:0> scan 'test3'
> ROW                                           COLUMN+CELL
> r1                                           column=f1:c1, timestamp=1322119570316, value=new
> 1 row(s) in 0.0140 seconds
>
> hbase(main):014:0> delete 'test3', 'r1', 'f1:c1', 1322119570316
> 0 row(s) in 0.0130 seconds
>
> hbase(main):015:0> scan 'test3'
> ROW                                           COLUMN+CELL
> 0 row(s) in 0.0120 seconds
>
> hbase(main):016:0> put 'test3', 'r1', 'f1:c1', 'old', 1315550678308
> 0 row(s) in 0.0090 seconds
>
> hbase(main):017:0> scan 'test3'
> ROW                                           COLUMN+CELL
> 0 row(s) in 0.0110 seconds
>
> There's no error message when we reinsert the old version, so we think it
> has succeeded, but actually it has not. It looks like a bug.
>
> What's your opinion?
>

Hi,

The second point is not a bug; it's how HBase is designed. A delete inserts a tombstone marker rather than removing data. Except for deleteColumn, which targets one exact version, the marker masks every value with an older or equal timestamp, so even if you later insert a value with an older timestamp it will be masked by the tombstone. You can see some nice examples here: http://outerthought.org/blog/417-ot.html
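
To make the masking concrete, here is a minimal sketch in plain Python. It is a toy model, not HBase code: the class `ToyColumn` and its methods are made up for illustration, and it only assumes that a deleteColumns marker hides every version at or below its timestamp while a deleteColumn marker hides the one version at exactly its timestamp.

```python
from dataclasses import dataclass, field


@dataclass
class ToyColumn:
    """Hypothetical toy model of one column: versioned puts plus delete markers."""
    puts: dict = field(default_factory=dict)            # timestamp -> value
    column_markers: list = field(default_factory=list)  # deleteColumns-style markers
    version_markers: set = field(default_factory=set)   # deleteColumn-style markers

    def put(self, value, ts):
        self.puts[ts] = value

    def delete_columns(self, ts):
        # Assumed semantics: masks every version with timestamp <= ts.
        self.column_markers.append(ts)

    def delete_column(self, ts):
        # Assumed semantics: masks only the version with timestamp == ts.
        self.version_markers.add(ts)

    def visible(self):
        """Return {timestamp: value} for cells not masked by any marker."""
        ceiling = max(self.column_markers, default=-1)
        return {ts: v for ts, v in self.puts.items()
                if ts > ceiling and ts not in self.version_markers}


OLD_TS, NEW_TS = 1315550678308, 1322119570316

# Replay of the shell session: the delete acts like deleteColumns.
col = ToyColumn()
col.put("old", OLD_TS)
col.put("new", NEW_TS)
col.delete_columns(NEW_TS)
col.put("old", OLD_TS)          # accepted without error, but still masked
masked_result = col.visible()

# Same steps with a version-specific delete instead.
col2 = ToyColumn()
col2.put("old", OLD_TS)
col2.put("new", NEW_TS)
col2.delete_column(NEW_TS)
exact_result = col2.visible()

print(masked_result)  # {}
print(exact_result)   # {1315550678308: 'old'}
```

The first run reproduces the empty scan from the question: the re-inserted 'old' cell is stored but sits below the marker. In a real cluster the marker itself stays around until a major compaction removes it, which this sketch does not model.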

There is also a new feature in trunk that allows you to retrieve masked values through a "raw scan" or a get with a timeRange that excludes the delete: https://issues.apache.org/jira/browse/HBASE-4536
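
Roughly, the two read paths behave like the sketch below. Again this is plain Python over a made-up list of cells, not the HBase client API; the function names are hypothetical, and the only assumptions are that a raw scan returns markers and masked cells alike, and that a time range is inclusive at the lower bound and exclusive at the upper bound.

```python
# Toy store: (timestamp, kind, value); kind is "put" or "delete_columns".
cells = [
    (1315550678308, "put", "old"),
    (1322119570316, "put", "new"),
    (1322119570316, "delete_columns", None),
]


def normal_scan(entries):
    """Hide every put at or below the newest delete marker."""
    ceiling = max((ts for ts, kind, _ in entries if kind == "delete_columns"),
                  default=-1)
    return [(ts, v) for ts, kind, v in entries if kind == "put" and ts > ceiling]


def raw_scan(entries):
    """Return everything, delete markers and masked puts included."""
    return list(entries)


def get_with_time_range(entries, min_ts, max_ts):
    """Look only at entries with min_ts <= ts < max_ts, then mask as usual."""
    window = [e for e in entries if min_ts <= e[0] < max_ts]
    return normal_scan(window)


print(normal_scan(cells))                            # []
print(len(raw_scan(cells)))                          # 3
print(get_with_time_range(cells, 0, 1322119570316))  # [(1315550678308, 'old')]
```

Because the upper bound of the range stops just short of the delete marker, the marker never enters the window and the old value becomes readable again.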

Daniel

> Thanks,
> Yi
Yi Liang 2011-11-25, 06:11
lars hofhansl 2011-11-28, 23:56
Shrijeet Paliwal 2011-11-29, 00:31
lars hofhansl 2011-11-29, 01:33
Shrijeet Paliwal 2011-11-29, 01:49
lars hofhansl 2011-11-29, 04:16
Shrijeet Paliwal 2011-11-29, 04:36