Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
HBase, mail # user - Issue with column-counting filters accepting multiple versions of a column


Copy link to this message
-
Issue with column-counting filters accepting multiple versions of a column
Andrew Olson 2012-10-04, 20:33
It looks like the max version limit for a table or scanner is not applied
to disregard older versions, prior to counting columns within a
ColumnPaginationFilter or ColumnCountGetFilter. As a result, a Scan or Get
can ultimately retrieve fewer than the requested number of columns when
there is a sufficient number of existing columns to satisfy the request, if
multiple versions of a column have been added to a row.

A minimal test case demonstrating this behavior can be found here:
https://gist.github.com/3836132

The javadoc for Get mentions 'Only Filter.filterKeyValue(KeyValue) is
called AFTER all tests for ttl, column match, deletes and *max
versions*have been run.'; for these two filters this behavior does not
appear to be
true, as flattening of multiple versions appears to occur after the filter
has been applied.

Should this be considered a bug? If so, are there any possible workarounds
besides implementing and deploying a custom Filter class?

thanks,
Andrew