HBase, mail # user - Behavior of Filter.transform() in FilterList?


Re: Behavior of Filter.transform() in FilterList?
Christophe Taton 2013-07-02, 02:17
On Mon, Jul 1, 2013 at 12:01 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:

> It would make sense, but it is not immediately clear how to do so cleanly.
> We would no longer be able to call transform at the StoreScanner level (or
> evaluate the filter multiple times, or require the filters to maintain
> their - last - state and only apply transform selectively).
>

I believe this change can be implemented directly in FilterList, without
requiring other changes.
A FilterList could compute its transformed KeyValue while applying
filterKeyValue() to the filters it contains, and return the pre-computed
transformed KeyValue from FilterList.transform() if it makes sense to do so.

This means Filter.transform() is always applied immediately after a
filterKeyValue() with a return code that includes the KeyValue, and this
would be true for all filters in the hierarchy.
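
To make the idea concrete, here is a minimal sketch of such a FilterList. It
is a toy model, not the HBase code: Cell, ReturnCode, Operator, KeyOnly,
CachingFilterList and the family()/column() helpers are all hypothetical
stand-ins that only borrow the method names filterKeyValue() and transform().
It also assumes that, in an OR list, the transforms of all children that
included the cell are chained in order.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of the proposal. None of these types are the real HBase classes. */
final class CachingFilterListSketch {

    /** Simplified stand-in for a KeyValue. */
    static final class Cell {
        final String family, qualifier, value;
        Cell(String family, String qualifier, String value) {
            this.family = family; this.qualifier = qualifier; this.value = value;
        }
    }

    enum ReturnCode { INCLUDE, SKIP }
    enum Operator { AND, OR }

    interface Filter {
        ReturnCode filterKeyValue(Cell cell);
        default Cell transform(Cell cell) { return cell; } // no-op unless overridden
    }

    static Filter family(String name) {
        return cell -> cell.family.equals(name) ? ReturnCode.INCLUDE : ReturnCode.SKIP;
    }

    static Filter column(String name) {
        return cell -> cell.qualifier.equals(name) ? ReturnCode.INCLUDE : ReturnCode.SKIP;
    }

    /** Mimics the effect of KeyOnlyFilter: keeps the key, drops the value. */
    static final class KeyOnly implements Filter {
        public ReturnCode filterKeyValue(Cell cell) { return ReturnCode.INCLUDE; }
        public Cell transform(Cell cell) { return new Cell(cell.family, cell.qualifier, ""); }
    }

    static final class CachingFilterList implements Filter {
        private final Operator operator;
        private final List<Filter> filters;
        private Cell transformed; // set by the most recent filterKeyValue() call

        CachingFilterList(Operator operator, Filter... filters) {
            this.operator = operator;
            this.filters = Arrays.asList(filters);
        }

        @Override
        public ReturnCode filterKeyValue(Cell cell) {
            Cell current = cell;
            boolean anyIncluded = false;
            for (Filter filter : filters) {
                if (filter.filterKeyValue(cell) == ReturnCode.INCLUDE) {
                    // Transform right away, and only for a child that included the cell.
                    current = filter.transform(current);
                    anyIncluded = true;
                } else if (operator == Operator.AND) {
                    transformed = cell; // the list rejects the cell: discard partial transforms
                    return ReturnCode.SKIP;
                }
            }
            boolean included = operator == Operator.AND || anyIncluded;
            transformed = included ? current : cell;
            return included ? ReturnCode.INCLUDE : ReturnCode.SKIP;
        }

        @Override
        public Cell transform(Cell cell) {
            // Simplification: returns the cached result and ignores the argument.
            return transformed != null ? transformed : cell;
        }
    }

    /** (family=X and column=Y and KeyOnly) or (family=A and column=B) */
    static Filter exampleFilter() {
        return new CachingFilterList(Operator.OR,
            new CachingFilterList(Operator.AND, family("X"), column("Y"), new KeyOnly()),
            new CachingFilterList(Operator.AND, family("A"), column("B")));
    }

    /** Returns the transformed value, or null if the cell is filtered out. */
    static String scan(Filter filter, Cell cell) {
        return filter.filterKeyValue(cell) == ReturnCode.INCLUDE
            ? filter.transform(cell).value : null;
    }

    public static void main(String[] args) {
        Filter filter = exampleFilter();
        System.out.println("A:B -> '" + scan(filter, new Cell("A", "B", "v1")) + "'");
        System.out.println("X:Y -> '" + scan(filter, new Cell("X", "Y", "v2")) + "'");
    }
}
```

In this model the A:B cell keeps its value "v1", because the KeyOnly branch
never included it, while the X:Y cell comes back with an empty value. Because
a nested list returns its cached result and ignores the argument, chaining is
only exact when at most one nested list rewrites the cell, which is the case
here.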

C.

> I added transform() a while ago in order to allow a Filter *not* to
> transform. Before that we defensively made a copy of the key, just in case
> a Filter (such as KeyOnlyFilter) would modify it; now this is formalized,
> and the filter is responsible for making a copy only when needed.
>
>
> -- Lars
>
>
>
> ________________________________
>  From: Christophe Taton <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
> Sent: Monday, July 1, 2013 10:27 AM
> Subject: Re: Behavior of Filter.transform() in FilterList?
>
>
>
> On Mon, Jul 1, 2013 at 4:14 AM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>
> > You want transform to only be called on filters that are "reached"?
> > I.e., for FilterA AND FilterB, FilterB.transform should not be called if a KV
> > is already filtered out by FilterA?
> >
>
> Yes, that's what I naively expected, at first.
>
> > That's not how it works right now; transform is called in a completely
> > different code path from the actual filtering logic.
> >
>
> Indeed, I just learned that.
> I found no documentation of this behavior, did I miss it?
> In particular, the javadoc of the workflow of Filter doesn't mention
> transform() at all.
> Would it make sense to apply transform() only if the return code for
> filterKeyValue() includes the KeyValue?
>
> C.
>
> -- Lars
> >
> >
> >----- Original Message -----
> >From: Christophe Taton <[EMAIL PROTECTED]>
> >To: [EMAIL PROTECTED]
> >Cc:
> >Sent: Sunday, June 30, 2013 10:26 PM
> >Subject: Re: Behavior of Filter.transform() in FilterList?
> >
> >On Sun, Jun 30, 2013 at 10:15 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
> >
> >> The clause 'family=X and column=Y and KeyOnlyFilter' would be represented
> >> by a FilterList, right?
> >> (family=A and column=B) would be represented by another FilterList.
> >>
> >
> >Yes, that would be FilterList(OR, [FilterList(AND, [family=X, column=Y,
> >KeyOnlyFilter]), FilterList(AND, [family=A, column=B])]).
> >
> >So the behavior is expected.
> >>
> >
> >Could you explain? I'm not sure how you reached this conclusion.
> >Are you saying it is expected, given the actual implementation of
> >FilterList.transform()?
> >Or are there some other details I missed?
> >
> >Thanks!
> >C.
> >
> >On Mon, Jul 1, 2013 at 1:10 PM, Christophe Taton <[EMAIL PROTECTED]> wrote:
> >>
> >> > Hi,
> >> >
> >> > From
> >> > https://github.com/apache/hbase/blob/0.95/hbase-client/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java#L183,
> >> > it appears that Filter.transform() is invoked unconditionally on all
> >> > filters in a FilterList hierarchy.
> >> >
> >> > This is quite confusing, especially since I may construct a filter like:
> >> >     (family=X and column=Y and KeyOnlyFilter) or (family=A and column=B)
> >> > The KeyOnlyFilter will remove all values from the KeyValues in A:B as well.
> >> >
> >> > Is my understanding correct? Is this an expected/intended behavior?
> >> >
> >> > Thanks,
> >> > C.
> >> >
> >>
> >
> >
>
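
For readers who want to see the behavior from the original question in
isolation, the following toy model reproduces it. It is only a sketch: Cell,
Filter, FilterList, KeyOnly and the helper methods are hypothetical stand-ins
rather than the HBase classes, and filtering is reduced to a boolean. The one
point it keeps is that transform() walks every child, whether or not that
child matched the cell.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of the behavior in the original question. Not the HBase classes. */
final class UnconditionalTransformSketch {

    /** Simplified stand-in for a KeyValue. */
    static final class Cell {
        final String family, qualifier, value;
        Cell(String family, String qualifier, String value) {
            this.family = family; this.qualifier = qualifier; this.value = value;
        }
    }

    interface Filter {
        boolean includes(Cell cell);                       // simplified filterKeyValue()
        default Cell transform(Cell cell) { return cell; } // no-op unless overridden
    }

    static Filter family(String name) { return cell -> cell.family.equals(name); }
    static Filter column(String name) { return cell -> cell.qualifier.equals(name); }

    /** Mimics the effect of KeyOnlyFilter: keeps the key, drops the value. */
    static final class KeyOnly implements Filter {
        public boolean includes(Cell cell) { return true; }
        public Cell transform(Cell cell) { return new Cell(cell.family, cell.qualifier, ""); }
    }

    static final class FilterList implements Filter {
        private final boolean mustPassAll; // true = AND, false = OR
        private final List<Filter> filters;

        FilterList(boolean mustPassAll, Filter... filters) {
            this.mustPassAll = mustPassAll;
            this.filters = Arrays.asList(filters);
        }

        @Override
        public boolean includes(Cell cell) {
            boolean any = false;
            for (Filter filter : filters) {
                boolean ok = filter.includes(cell);
                if (mustPassAll && !ok) return false;
                any |= ok;
            }
            return mustPassAll || any;
        }

        @Override
        public Cell transform(Cell cell) {
            Cell current = cell;
            for (Filter filter : filters) {
                current = filter.transform(current); // every child, matched or not
            }
            return current;
        }
    }

    /** (family=X and column=Y and KeyOnly) or (family=A and column=B) */
    static Filter exampleFilter() {
        return new FilterList(false,
            new FilterList(true, family("X"), column("Y"), new KeyOnly()),
            new FilterList(true, family("A"), column("B")));
    }

    public static void main(String[] args) {
        Filter filter = exampleFilter();
        Cell ab = new Cell("A", "B", "v1");
        System.out.println("included: " + filter.includes(ab));
        System.out.println("value after transform: '" + filter.transform(ab).value + "'");
    }
}
```

Here the A:B cell is included through the second branch, yet its value is
emptied by the KeyOnly stand-in in the first branch, which matches the
surprise described in the question.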