HBase >> mail # user >> Issue with column-counting filters accepting multiple versions of a column


Re: Issue with column-counting filters accepting multiple versions of a column
Filters are applied before the version counting is performed.
This is a frequent area of contention. If filters were applied after the version counting, other folks would complain for other reasons (and they have complained: in the early days filters were in fact evaluated after the version counting, which is why it was changed).
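
To make the ordering concrete, here is a minimal, self-contained Java sketch. It does not use the HBase API: `Cell`, `countFilter`, `limitVersions` and the sample row are hypothetical stand-ins that only model the behavior described in this thread (a counting filter that accepts the first N cells it sees, and a max-versions limit that keeps the newest cells per column). It is a simplification under those assumptions, not the actual server-side code path.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FilterOrderSketch {
    /** Hypothetical stand-in for a KeyValue: column qualifier plus timestamp. */
    static final class Cell {
        final String column;
        final long timestamp;
        Cell(String column, long timestamp) {
            this.column = column;
            this.timestamp = timestamp;
        }
        @Override public String toString() { return column + "@" + timestamp; }
    }

    /** Keeps at most maxVersions cells per column; input is sorted by column, newest first. */
    static List<Cell> limitVersions(List<Cell> cells, int maxVersions) {
        List<Cell> out = new ArrayList<>();
        String current = null;
        int seen = 0;
        for (Cell c : cells) {
            if (!c.column.equals(current)) { current = c.column; seen = 0; }
            if (++seen <= maxVersions) out.add(c);
        }
        return out;
    }

    /** Models a column-counting filter: accepts the first `limit` cells it is shown. */
    static List<Cell> countFilter(List<Cell> cells, int limit) {
        return new ArrayList<>(cells.subList(0, Math.min(limit, cells.size())));
    }

    /** Ordering described in this thread: filter first, then version counting. */
    static List<Cell> filterThenVersions(List<Cell> cells, int limit, int maxVersions) {
        return limitVersions(countFilter(cells, limit), maxVersions);
    }

    /** The alternative ordering: version counting first, then the filter. */
    static List<Cell> versionsThenFilter(List<Cell> cells, int limit, int maxVersions) {
        return countFilter(limitVersions(cells, maxVersions), limit);
    }

    /** Sample row: column "a" has three versions, "b" and "c" have one each. */
    static List<Cell> sampleRow() {
        return Arrays.asList(new Cell("a", 3), new Cell("a", 2), new Cell("a", 1),
                             new Cell("b", 1), new Cell("c", 1));
    }

    public static void main(String[] args) {
        List<Cell> row = sampleRow();
        // Filter counts both versions of "a", then max versions drops the older one.
        System.out.println("filter first:   " + filterThenVersions(row, 2, 1)); // [a@3]
        // Older versions are dropped first, so the filter sees two distinct columns.
        System.out.println("versions first: " + versionsThenFilter(row, 2, 1)); // [a@3, b@1]
    }
}
```

With a limit of 2 columns and max versions of 1, the filter-first ordering returns a single column even though three columns exist in the row, which matches the symptom reported below.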

Unless we allow a filter to declare whether it needs to be run before or after the version counting, we will always have an unhappy party :(
(I started thinking about this in HBASE-5257 but abandoned that for lack of interest)
-- Lars

________________________________
 From: Andrew Olson <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Sent: Thursday, October 4, 2012 1:33 PM
Subject: Issue with column-counting filters accepting multiple versions of a column
 
It looks like the max version limit for a table or scanner is not applied
to disregard older versions, prior to counting columns within a
ColumnPaginationFilter or ColumnCountGetFilter. As a result, a Scan or Get
can ultimately retrieve fewer than the requested number of columns when
there is a sufficient number of existing columns to satisfy the request, if
multiple versions of a column have been added to a row.

A minimal test case demonstrating this behavior can be found here:
https://gist.github.com/3836132

The javadoc for Get mentions 'Only Filter.filterKeyValue(KeyValue) is
called AFTER all tests for ttl, column match, deletes and *max versions*
have been run.'; for these two filters this behavior does not appear to be
true, as flattening of multiple versions appears to occur after the filter
has been applied.

Should this be considered a bug? If so, are there any possible workarounds
besides implementing and deploying a custom Filter class?

thanks,
Andrew