HBase, mail # dev - ColumnPaginationFilter called twice per line?


Re: ColumnPaginationFilter called twice per line?
Varun Sharma 2013-10-18, 22:15
Not really - IIRC, ColumnPaginationFilter was broken prior to this fix - it
was doing some incorrect version counting, and it had to do with the way
version tracking and filtering were entangled (I forget the exact
issue).

Varun
On Fri, Oct 18, 2013 at 1:46 PM, Jean-Marc Spaggiari <
[EMAIL PROTECTED]> wrote:

> Hi Varun,
>
> Thanks for the pointer. Do you recall any particular version-related
> work around 5257? I found a few simple ways to fix this, but I'm not 100%
> sure of the impact on the other use cases. Also, I looked at the test
> cases and I think we should add more of them.
>
> Thanks,
>
> JM
>
> On Thursday, October 17, 2013, Varun Sharma wrote:
>
> > There is some history in HBase 5257 - that's where
> > INCLUDE_AND_SEEK_NEXT_COL was introduced.
> >
> >
> > On Wed, Oct 16, 2013 at 9:21 PM, Jean-Marc Spaggiari <
> > [EMAIL PROTECTED]> wrote:
> >
> > > So. Here are more details.
> > >
> > > The "issue" is that ScanQueryMatcher returns INCLUDE_AND_SEEK_NEXT_COL
> > > instead of INCLUDE in this specific case. We don't want to seek to the
> > > next column. I still have some difficulty understanding everything this
> > > code is doing, but I will keep looking.
> > >
> > > JM
> > >
> > >
> > > 2013/10/16 Jean-Marc Spaggiari <[EMAIL PROTECTED]>
> > >
> > > > Ok. Confirmed. It's called twice:
> > > >
> > > > 2013-10-16 18:45:40,819 INFO
> > > > org.apache.hadoop.hbase.filter.ColumnPaginationFilter: A 9990
> > > > 2013-10-16 18:45:40,819 INFO
> > > > org.apache.hadoop.hbase.filter.ColumnPaginationFilter: A 9990
> > > > 2013-10-16 18:45:40,819 INFO
> > > > org.apache.hadoop.hbase.filter.ColumnPaginationFilter: A 9989
> > > > 2013-10-16 18:45:40,819 INFO
> > > > org.apache.hadoop.hbase.filter.ColumnPaginationFilter: A 9989
> > > > 2013-10-16 18:45:40,819 INFO
> > > > org.apache.hadoop.hbase.filter.ColumnPaginationFilter: A 9988
> > > > 2013-10-16 18:45:40,819 INFO
> > > > org.apache.hadoop.hbase.filter.ColumnPaginationFilter: A 9988
> > > >
> > > > The filterKeyValue method is called twice per cell version. I will
> > > > try to figure out why.
> > > >
> > > > JM
> > > >
> > > >
> > > >
> > > > 2013/10/16 Jean-Marc Spaggiari <[EMAIL PROTECTED]>
> > > >
> > > >> Is anyone using filters to filter versions on a single row?
> > > >>
> > > >> I looked at the ColumnPaginationFilter code and it's clean and very
> > > >> small. But on the client side, when I ask for the first 100 versions
> > > >> of a row/CF/C, I only get the first 50. If I do a scan from the shell,
> > > >> I get the 10,000 versions correctly. If I do a scan from the client
> > > >> without the filter, I get the 10K versions.
> > > >>
> > > >> I tried with 0.94.12.
> > > >>
> > > >> Since it does not seem to be related to the ColumnPaginationFilter
> > > >> code, I will start looking at the RS side to see how it's called, but
> > > >> I'm wondering if anyone uses this or has already seen it.
> > > >>
> > > >> Any pointer will be welcome too.
> > > >>
> > > >> JM
> > > >>
> > > >
> > > >
> > >
> >
>
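
The symptom reported in the thread (asking for 100 versions and getting 50, with the filter logged twice per cell version) is what you would expect if a counting filter is invoked twice for each cell. The following is a minimal sketch of that arithmetic only. It does not use or reproduce HBase code: `DoubleCallSketch`, `CountingPageFilter`, `accept`, and `scan` are hypothetical names, and the rule that the last call decides inclusion is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class DoubleCallSketch {

    /** Hypothetical stand-in for a counting pagination filter (not the HBase class). */
    static final class CountingPageFilter {
        private final int limit;
        private final int offset;
        private int count = 0;

        CountingPageFilter(int limit, int offset) {
            this.limit = limit;
            this.offset = offset;
        }

        /** Decides inclusion from the current count; every call advances the counter. */
        boolean accept() {
            boolean include = count >= offset && count < offset + limit;
            count++;
            return include;
        }
    }

    /**
     * Feeds `cells` versions (newest first) through the filter, invoking it
     * `callsPerCell` times per version. Assumption: the last call decides.
     */
    static List<Integer> scan(int cells, int callsPerCell, CountingPageFilter filter) {
        List<Integer> kept = new ArrayList<>();
        for (int version = cells; version >= 1; version--) {
            boolean include = false;
            for (int i = 0; i < callsPerCell; i++) {
                include = filter.accept();
            }
            if (include) {
                kept.add(version);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // One call per version: the full page of 100 versions is returned.
        System.out.println(scan(10000, 1, new CountingPageFilter(100, 0)).size());
        // Two calls per version: the counter runs twice as fast, so only 50 survive.
        System.out.println(scan(10000, 2, new CountingPageFilter(100, 0)).size());
    }
}
```

In this model the counter reaches the limit after 50 versions when each version is counted twice, which matches the 100-requested, 50-returned behaviour described above. It says nothing about why the region server calls the filter twice.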