HBase >> mail # user >> HBase returns old values even with max versions = 1


Re: HBase returns old values even with max versions = 1
Before the second get command was executed, was there a compaction on the
server side?

You can find out by going to the region server hosting row 'r1' and checking
the server log.
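
To rule compaction in or out, one option is to force a flush and a major
compaction and then repeat the time-range get. This is only a sketch for the
hbase shell, assuming the table 't1' from the example below. Note that
major_compact only queues the request, so the compaction may finish some time
after the command returns:

```ruby
# Run inside the hbase shell (JRuby), not plain Ruby.
flush 't1'            # write the memstore contents out to a store file
major_compact 't1'    # request a major compaction (runs asynchronously)
# After the compaction has finished, repeat the time-range query:
get 't1', 'r1', {TIMERANGE => [0, 1500]}
```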

Cheers
On Sat, Dec 7, 2013 at 12:05 AM, Niels Basjes <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I want to find the columns that have not been updated for more
> than a specific time period.
>
> So I want to do a scan against the columns with a timerange.
> The normal behavior of HBase is that you then get the latest value in that
> time range (which is not what I want).
>
> As far as I understand, HBase should work like this: if you set the
> maximum number of versions for the values in a column family to '1', it
> should retain only the last value that was put into the cell.
>
> What I found is different.
>
> If I run the following commands in the hbase shell
>
>     create 't1', {NAME => 'c1', VERSIONS => 1}
>     put 't1', 'r1', 'c1', 'One', 1000
>     put 't1', 'r1', 'c1', 'Two', 2000
>     put 't1', 'r1', 'c1', 'Three', 3000
>     get 't1', 'r1'
>     get 't1', 'r1' , {TIMERANGE => [0,1500]}
>
> the result is this:
>
>     get 't1', 'r1'
>     COLUMN                     CELL
>      c1:                       timestamp=3000, value=Three
>     1 row(s) in 0.0780 seconds
>
>     get 't1', 'r1' , {TIMERANGE => [0,1500]}
>     COLUMN                     CELL
>      c1:                       timestamp=1000, value=One
>     1 row(s) in 0.1390 seconds
>
> Why does the second query return a value even though I've set the max
> versions to only 1?
> I expect that it only 'knows' about the latest value ('Three') and thus
> should return an empty result in the above example.
> What is the correct way to obtain what I'm looking for?
>
> My current workaround is that I simply retrieve the latest value for all my
> columns and filter them in my application code.
>
> The HBase version I currently have installed here is HBase 0.94.6-cdh4.4.0
>
> --
> Best regards / Met vriendelijke groeten,
>
> Niels Basjes
>
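
The client-side workaround described in the quoted message (fetch the latest
value of each column, then filter by timestamp in the application) could look
roughly like the sketch below. It is plain Ruby with no HBase client involved:
the Cell struct, the stale_columns helper, and the qualifiers c1:a and c1:b
are hypothetical stand-ins for whatever a get without TIMERANGE returns.

```ruby
# Hypothetical in-memory stand-in for the latest cell of each column,
# as a get or scan without TIMERANGE would return it.
Cell = Struct.new(:column, :timestamp, :value)

# Returns the names of the columns whose latest cell is older than cutoff_ts.
# The comparison is exclusive, like the upper bound of TIMERANGE.
def stale_columns(latest_cells, cutoff_ts)
  latest_cells.select { |cell| cell.timestamp < cutoff_ts }.map(&:column)
end

# Assumed sample data: one column updated recently, one not.
latest_cells = [
  Cell.new('c1:a', 3000, 'Three'),
  Cell.new('c1:b', 1000, 'One')
]

puts stale_columns(latest_cells, 1500).inspect
```

Because only the newest cell per column is examined, a column that has a newer
value never shows up as stale, regardless of whether older versions are still
physically present on the region server.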