HBase >> mail # user >> Key Value collision


Re: Key Value collision
On Thu, May 16, 2013 at 11:49 AM, Varun Sharma <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I am wondering what happens when we add the following:
>
> row, col, timestamp --> v1
>
> A flush happens. Now, we add
>
> row, col, timestamp --> v2
>
> A flush happens again. In this case if MAX_VERSIONS == 1, how is the tie
> broken during reads and during minor compactions, is it arbitrary ?
>

See what we have to say at http://hbase.apache.org/book.html#versions.
I believe your question is answered therein (it's what Michael says).

Here are a few tools in case you want to verify or interrogate its behavior
for yourself.

Start up a local instance.

Then start up a shell, create a table, insert a row then flush:

durruti:hbase-0.94.7 stack$ ./bin/hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.94.7, r1471806, Wed Apr 24 18:48:26 PDT 2013

hbase(main):001:0> create 't', 'f'
2013-05-16 22:55:02.804 java[86479:1203] Unable to load realm info from
SCDynamicStore
0 row(s) in 5.5630 seconds

hbase(main):002:0> put 't', 'r', 'f:q', 'some value', 2
0 row(s) in 0.0800 seconds

hbase(main):003:0> flush 't'
0 row(s) in 5.4300 seconds

Check that you have a flushed file with the expected content (my data is in
the default /tmp/hbase-USER dir):

durruti:hbase-0.94.7 stack$ ./bin/hbase org.apache.hadoop.hbase.io.hfile.HFile --printkv -f /tmp/hbase-stack/hbase/t/9820e76663df9e62807ecd88ed8e8588/f/8e7b62f748dc46aca8fb57f6fb153d90
13/05/16 22:57:02 INFO util.ChecksumType: Checksum can use
java.util.zip.CRC32
2013-05-16 22:57:02.467 java[86593:1203] Unable to load realm info from
SCDynamicStore
13/05/16 22:57:02 INFO hfile.CacheConfig: Allocating LruBlockCache with
maximum size 246.9m
13/05/16 22:57:02 ERROR metrics.SchemaMetrics: Inconsistent configuration.
Previous configuration for using table name in metrics: true, new
configuration: false
K: r/f:q/2/Put/vlen=10/ts=0 V: some value
Scanned kv count -> 1
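
For anyone scripting this check, here is a minimal sketch of pulling the fields out of a `K: ... V: ...` line like the one above. The field layout (row/family:qualifier/timestamp/type/vlen/ts) is inferred from this one line of output, and the reading of the trailing `ts=` as an internal write-sequence number rather than the cell timestamp is an assumption. `parse_printkv` and its field names are hypothetical and not part of HBase, and the pattern does not handle rows or qualifiers containing '/'.

```python
import re

# Assumed layout of one --printkv line, inferred from the output above:
#   K: row/family:qualifier/timestamp/type/vlen=N/ts=M V: value
LINE = re.compile(
    r"^K: (?P<row>[^/]*)/(?P<family>[^:/]*):(?P<qualifier>[^/]*)"
    r"/(?P<timestamp>\d+)/(?P<type>\w+)/vlen=(?P<vlen>\d+)/ts=(?P<seq>\d+)"
    r" V: (?P<value>.*)$"
)


def parse_printkv(line):
    """Return the fields of one printed KeyValue as a dict, or None if no match."""
    match = LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Numeric fields arrive as strings; convert them for easier comparison.
    for name in ("timestamp", "vlen", "seq"):
        fields[name] = int(fields[name])
    return fields


if __name__ == "__main__":
    print(parse_printkv("K: r/f:q/2/Put/vlen=10/ts=0 V: some value"))
```

Comparing `vlen` against the length of the parsed value is a cheap sanity check that the line was split correctly.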

Back in the shell, do another insert and flush with a different value at the
same timestamp of '2':

hbase(main):004:0> put 't', 'r', 'f:q', 'more recently written value', 2
0 row(s) in 0.0050 seconds

# Throw in a flush if you want...

hbase(main):011:0> get 't', 'r', {COLUMN => 'f:q', VERSIONS => 1}
COLUMN                                               CELL
 f:q                                                 timestamp=2,
value=more recently written value
1 row(s) in 0.0180 seconds
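
The get above returns the later write. One way to picture that result is the toy model below, which is not HBase code: it assumes cells with identical row, column, and timestamp are ordered by when they were written, with the later write shadowing the earlier one. `Cell`, `seq_id`, and `read` are hypothetical names made up for this sketch; `seq_id` stands in for whatever write-ordering information the store keeps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    row: str
    column: str
    timestamp: int
    value: str
    seq_id: int  # hypothetical write-order marker: larger means written later


def sort_key(cell):
    # Rows and columns ascending, timestamps descending (newest version first),
    # and for identical coordinates the later write (larger seq_id) first.
    return (cell.row, cell.column, -cell.timestamp, -cell.seq_id)


def read(store_files, row, column, max_versions=1):
    """Merge all flushed files and return up to max_versions cells for one column."""
    candidates = sorted(
        (c for f in store_files for c in f if c.row == row and c.column == column),
        key=sort_key,
    )
    visible, seen_timestamps = [], set()
    for cell in candidates:
        if cell.timestamp in seen_timestamps:
            continue  # same coordinates as a later write, so it is shadowed
        seen_timestamps.add(cell.timestamp)
        visible.append(cell)
    return visible[:max_versions]


if __name__ == "__main__":
    first_flush = [Cell("r", "f:q", 2, "some value", seq_id=1)]
    second_flush = [Cell("r", "f:q", 2, "more recently written value", seq_id=2)]
    for cell in read([first_flush, second_flush], "r", "f:q"):
        print(cell.timestamp, cell.value)
```

Under this model the answer does not depend on the order in which the files are handed to `read`, only on which write came later, which matches the shell session above.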
St.Ack