HBase >> mail # user >> Behavior of Filter.transform() in FilterList?


Re: Behavior of Filter.transform() in FilterList?
Christophe:
Looks like you have a clear idea of what to do.

If you can show us in the form of a patch, that would be nice.

Cheers

On Mon, Jul 1, 2013 at 7:17 PM, Christophe Taton <[EMAIL PROTECTED]> wrote:

> On Mon, Jul 1, 2013 at 12:01 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>
> > It would make sense, but it is not immediately clear how to do so cleanly.
> > We would no longer be able to call transform at the StoreScanner level (or
> > evaluate the filter multiple times, or require the filters to maintain
> > their - last - state and only apply transform selectively).
> >
>
> I believe this change can be implemented directly in FilterList, without
> requiring other changes.
> A FilterList could compute its transformed KeyValue while applying
> filterKeyValue() on the filters it contains, and return the pre-computed
> transformed KeyValue in FilterList.transform() if it makes sense to do so.
>
> This means Filter.transform() is always applied immediately after a
> filterKeyValue() with a return code that includes the KeyValue, and this
> would be true for all filters in the hierarchy.
>
> C.
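
The proposal above can be sketched in a few lines. This is a minimal model under stated assumptions, not HBase code: Cell, MiniFilter, MiniFilterList and scan() are hypothetical stand-ins for KeyValue, Filter, FilterList and a scan, and include() collapses the filterKeyValue() return codes into a boolean. The list applies each child's transform() right after that child includes the cell, and caches the result for its own transform().

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-ins for illustration; these are NOT the HBase classes.
class CachedTransformSketch {
    // Stand-in for a KeyValue: just family, qualifier and value.
    static final class Cell {
        final String family, qualifier, value;
        Cell(String family, String qualifier, String value) {
            this.family = family; this.qualifier = qualifier; this.value = value;
        }
    }

    // include() plays the role of filterKeyValue(); transform() the role of Filter.transform().
    interface MiniFilter {
        boolean include(Cell c);
        Cell transform(Cell c);
    }

    static MiniFilter familyIs(final String name) {
        return new MiniFilter() {
            public boolean include(Cell c) { return c.family.equals(name); }
            public Cell transform(Cell c) { return c; }
        };
    }

    static MiniFilter qualifierIs(final String name) {
        return new MiniFilter() {
            public boolean include(Cell c) { return c.qualifier.equals(name); }
            public Cell transform(Cell c) { return c; }
        };
    }

    // Mimics the idea of KeyOnlyFilter: keeps the key, drops the value.
    static MiniFilter keyOnly() {
        return new MiniFilter() {
            public boolean include(Cell c) { return true; }
            public Cell transform(Cell c) { return new Cell(c.family, c.qualifier, ""); }
        };
    }

    static final class MiniFilterList implements MiniFilter {
        private final boolean mustPassAll;      // true = AND, false = OR
        private final List<MiniFilter> filters;
        private Cell transformed;               // precomputed by the last include() call

        MiniFilterList(boolean mustPassAll, MiniFilter... filters) {
            this.mustPassAll = mustPassAll;
            this.filters = Arrays.asList(filters);
        }

        public boolean include(Cell c) {
            Cell current = c;
            boolean any = false;
            transformed = null;
            for (MiniFilter f : filters) {
                if (f.include(c)) {
                    // Transform immediately, and only for children that included the cell.
                    current = f.transform(current);
                    any = true;
                } else if (mustPassAll) {
                    return false;               // AND: one rejection discards everything
                }
            }
            boolean included = mustPassAll || any;
            if (included) transformed = current;
            return included;
        }

        // Returns the precomputed result; a nested list therefore ignores its argument here.
        public Cell transform(Cell c) {
            return transformed != null ? transformed : c;
        }
    }

    // Returns the value seen by the client, or null if the cell is filtered out.
    static String scan(String family, String qualifier, String value) {
        MiniFilter filter = new MiniFilterList(false,
            new MiniFilterList(true, familyIs("X"), qualifierIs("Y"), keyOnly()),
            new MiniFilterList(true, familyIs("A"), qualifierIs("B")));
        Cell c = new Cell(family, qualifier, value);
        return filter.include(c) ? filter.transform(c).value : null;
    }

    public static void main(String[] args) {
        System.out.println("[" + scan("A", "B", "v") + "]");
        System.out.println("[" + scan("X", "Y", "v") + "]");
    }
}
```

In this model a cell matching (family=A and column=B) keeps its value, while a cell matching (family=X and column=Y) has it stripped by the key-only stand-in. The sketch keeps state between include() and transform(), which is the cost Lars mentions of filters having to maintain their last state.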
>
> > I added transform() a while ago in order to allow a Filter *not* to
> > transform. Before that, we defensively made a copy of the key, just in case
> > a Filter (such as KeyOnlyFilter) would modify it. Now this is formalized,
> > and the filter is responsible for making a copy only when needed.
> >
> >
> > -- Lars
> >
> >
> >
> > ________________________________
> >  From: Christophe Taton <[EMAIL PROTECTED]>
> > To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
> > Sent: Monday, July 1, 2013 10:27 AM
> > Subject: Re: Behavior of Filter.transform() in FilterList?
> >
> >
> >
> > On Mon, Jul 1, 2013 at 4:14 AM, lars hofhansl <[EMAIL PROTECTED]> wrote:
> >
> > >You want transform to only be called on filters that are "reached"?
> > >I.e., with FilterA and FilterB, FilterB.transform should not be called if a KV
> > >is already filtered by FilterA?
> > >
> >
> > Yes, that's what I naively expected, at first.
> >
> > >That's not how it works right now, transform is called in a completely
> > >different code path from the actual filtering logic.
> > >
> >
> > Indeed, I just learned that.
> > I found no documentation of this behavior, did I miss it?
> > In particular, the javadoc of the workflow of Filter doesn't mention
> > transform() at all.
> > Would it make sense to apply transform() only if the return code for
> > filterKeyValue() includes the KeyValue?
> >
> > C.
> >
> > -- Lars
> > >
> > >
> > >----- Original Message -----
> > >From: Christophe Taton <[EMAIL PROTECTED]>
> > >To: [EMAIL PROTECTED]
> > >Cc:
> > >Sent: Sunday, June 30, 2013 10:26 PM
> > >Subject: Re: Behavior of Filter.transform() in FilterList?
> > >
> > >On Sun, Jun 30, 2013 at 10:15 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
> > >
> > >> The clause 'family=X and column=Y and KeyOnlyFilter' would be represented
> > >> by a FilterList, right?
> > >> (family=A and column=B) would be represented by another FilterList.
> > >>
> > >
> > >Yes, that would be FilterList(OR, [FilterList(AND, [family=X, column=Y,
> > >KeyOnlyFilter]), FilterList(AND, [family=A, column=B])]).
> > >
> > >So the behavior is expected.
> > >>
> > >
> > >Could you explain? I'm not sure how you reached this conclusion.
> > >Are you saying it is expected, given the actual implementation of
> > >FilterList.transform()?
> > >Or are there some other details I missed?
> > >
> > >Thanks!
> > >C.
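
To make the question concrete, here is a small self-contained model of the behavior described in this thread, where transform() runs on every filter in the hierarchy regardless of which branch accepted the cell. It is only a sketch: Cell, MiniFilter, list() and scan() are hypothetical stand-ins, not the HBase KeyValue, Filter or FilterList classes, and include() reduces filterKeyValue() to a boolean.

```java
// Hypothetical, simplified stand-ins for illustration; these are NOT the HBase classes.
class UnconditionalTransformSketch {
    // Stand-in for a KeyValue: just family, qualifier and value.
    static final class Cell {
        final String family, qualifier, value;
        Cell(String family, String qualifier, String value) {
            this.family = family; this.qualifier = qualifier; this.value = value;
        }
    }

    // include() plays the role of filterKeyValue(); transform() the role of Filter.transform().
    interface MiniFilter {
        boolean include(Cell c);
        Cell transform(Cell c);
    }

    static MiniFilter familyIs(final String name) {
        return new MiniFilter() {
            public boolean include(Cell c) { return c.family.equals(name); }
            public Cell transform(Cell c) { return c; }
        };
    }

    static MiniFilter qualifierIs(final String name) {
        return new MiniFilter() {
            public boolean include(Cell c) { return c.qualifier.equals(name); }
            public Cell transform(Cell c) { return c; }
        };
    }

    // Mimics the idea of KeyOnlyFilter: keeps the key, drops the value.
    static MiniFilter keyOnly() {
        return new MiniFilter() {
            public boolean include(Cell c) { return true; }
            public Cell transform(Cell c) { return new Cell(c.family, c.qualifier, ""); }
        };
    }

    // mustPassAll = true models AND, false models OR.
    static MiniFilter list(final boolean mustPassAll, final MiniFilter... filters) {
        return new MiniFilter() {
            public boolean include(Cell c) {
                for (MiniFilter f : filters) {
                    boolean ok = f.include(c);
                    if (mustPassAll && !ok) return false;
                    if (!mustPassAll && ok) return true;
                }
                return mustPassAll;
            }
            // Separate code path: every child transform runs, even in branches
            // that rejected the cell during include().
            public Cell transform(Cell c) {
                Cell current = c;
                for (MiniFilter f : filters) current = f.transform(current);
                return current;
            }
        };
    }

    // Returns the value seen by the client, or null if the cell is filtered out.
    static String scan(String family, String qualifier, String value) {
        MiniFilter filter = list(false,
            list(true, familyIs("X"), qualifierIs("Y"), keyOnly()),
            list(true, familyIs("A"), qualifierIs("B")));
        Cell c = new Cell(family, qualifier, value);
        return filter.include(c) ? filter.transform(c).value : null;
    }

    public static void main(String[] args) {
        // The cell matches only the second branch, yet its value is stripped.
        System.out.println("[" + scan("A", "B", "v") + "]");
    }
}
```

In this model a cell with family=A and column=B is accepted by the second branch only, but it still comes back with an empty value because the key-only stand-in in the first branch is transformed as well.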
> > >
> > >On Mon, Jul 1, 2013 at 1:10 PM, Christophe Taton <[EMAIL PROTECTED]> wrote:
> > >>
> > >> > Hi,
> > >> >
> > >> > From
> > >> > https://github.com/apache/hbase/blob/0.95/hbase-client/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java#L183
> > >> > ,
> > >> > it appears that Filter.transform() is invoked unconditionally on all
> > >> > filters in a FilterList hierarchy.
> > >> >
> > >> > This is quite confusing, especially since I may construct a filter
>