HBase >> mail # user >> Behavior of Filter.transform() in FilterList?


Re: Behavior of Filter.transform() in FilterList?
On Mon, Jul 1, 2013 at 12:01 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:

> It would make sense, but it is not immediately clear how to do so cleanly.
> We would no longer be able to call transform at the StoreScanner level (or
> evaluate the filter multiple times, or require the filters to maintain
> their - last - state and only apply transform selectively).
>

I believe this change can be implemented directly in FilterList, without
requiring other changes.
A FilterList could compute its transformed KeyValue while applying
filterKeyValue() on the filters it contains, and return the pre-computed
transformed KeyValue from FilterList.transform() if it makes sense to do so.

This means Filter.transform() is always applied immediately after a
filterKeyValue() with a return code that includes the KeyValue, and this
would be true for all filters in the hierarchy.
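
A minimal sketch of this idea, using simplified stand-in types rather than
the real HBase classes. Cell, SimpleFilter, SimpleFilterList, KeyOnly,
column(), transformAll() and the AND/OR operators are all hypothetical names
chosen for illustration (HBase itself uses KeyValue, Filter, FilterList and
MUST_PASS_ALL/MUST_PASS_ONE), so this only shows the shape of the change and
is not a patch against FilterList:

```java
import java.util.Arrays;
import java.util.List;

public class FilterListSketch {

    /** Hypothetical stand-in for a KeyValue: family, qualifier, value. */
    static final class Cell {
        final String family, qualifier, value;
        Cell(String family, String qualifier, String value) {
            this.family = family; this.qualifier = qualifier; this.value = value;
        }
    }

    /** Hypothetical stand-in for Filter, reduced to the two calls discussed here. */
    interface SimpleFilter {
        boolean include(Cell c);                      // models filterKeyValue() including the KV
        default Cell transform(Cell c) { return c; }  // identity unless overridden
    }

    /** Matches a single family:qualifier and never transforms. */
    static SimpleFilter column(String family, String qualifier) {
        return c -> c.family.equals(family) && c.qualifier.equals(qualifier);
    }

    /** Models KeyOnlyFilter: includes every cell, strips the value in transform(). */
    static final class KeyOnly implements SimpleFilter {
        @Override public boolean include(Cell c) { return true; }
        @Override public Cell transform(Cell c) { return new Cell(c.family, c.qualifier, ""); }
    }

    enum Op { AND, OR }

    static final class SimpleFilterList implements SimpleFilter {
        final Op op;
        final List<SimpleFilter> filters;
        private Cell transformed;  // pre-computed by the last include() call, null if rejected

        SimpleFilterList(Op op, SimpleFilter... filters) {
            this.op = op;
            this.filters = Arrays.asList(filters);
        }

        /** Filters the cell and, as a side effect, transforms it with the filters that included it. */
        @Override public boolean include(Cell c) {
            Cell current = c;
            boolean anyIncluded = false;
            transformed = null;
            for (SimpleFilter f : filters) {
                if (f.include(c)) {
                    // transform() runs only right after an including result
                    current = f.transform(current);
                    anyIncluded = true;
                } else if (op == Op.AND) {
                    return false;  // rejected: partial transforms are discarded
                }
            }
            boolean included = (op == Op.AND) || anyIncluded;
            if (included) transformed = current;
            return included;
        }

        /**
         * Returns the pre-computed result. Simplification: the argument is ignored
         * when a result is available, so transforms applied by earlier siblings of a
         * nested list are not carried through.
         */
        @Override public Cell transform(Cell c) {
            return transformed != null ? transformed : c;
        }

        /** Models the behavior described in this thread: every transform() runs unconditionally. */
        Cell transformAll(Cell c) {
            for (SimpleFilter f : filters) {
                c = (f instanceof SimpleFilterList) ? ((SimpleFilterList) f).transformAll(c) : f.transform(c);
            }
            return c;
        }
    }

    /** (family=X and column=Y and KeyOnly) or (family=A and column=B) */
    static SimpleFilterList example() {
        return new SimpleFilterList(Op.OR,
            new SimpleFilterList(Op.AND, column("X", "Y"), new KeyOnly()),
            new SimpleFilterList(Op.AND, column("A", "B")));
    }

    public static void main(String[] args) {
        SimpleFilterList filter = example();
        Cell ab = new Cell("A", "B", "v");
        System.out.println("unconditional: '" + filter.transformAll(ab).value + "'");
        if (filter.include(ab)) {
            System.out.println("selective:     '" + filter.transform(ab).value + "'");
        }
    }
}
```

With the filter from the start of this thread, the unconditional path strips
the value of an A:B cell, while the selective path leaves it as 'v' because
the KeyOnly stand-in sits in a branch that rejected the cell.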

C.

> I added transform() a while ago in order to allow a Filter *not* to
> transform. Before that, we defensively made a copy of the key, just in case
> a Filter (such as KeyOnlyFilter) would modify it; now this is formalized,
> and the filter is responsible for making a copy only when needed.
>
>
> -- Lars
>
>
>
> ________________________________
>  From: Christophe Taton <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]; lars hofhansl <[EMAIL PROTECTED]>
> Sent: Monday, July 1, 2013 10:27 AM
> Subject: Re: Behavior of Filter.transform() in FilterList?
>
>
>
> On Mon, Jul 1, 2013 at 4:14 AM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>
> > You want transform to only be called on filters that are "reached"?
> > I.e. with FilterA and FilterB, FilterB.transform should not be called if a
> > KV is already filtered by FilterA?
> >
>
> Yes, that's what I naively expected, at first.
>
> > That's not how it works right now; transform is called in a completely
> > different code path from the actual filtering logic.
> >
>
> Indeed, I just learned that.
> I found no documentation of this behavior, did I miss it?
> In particular, the javadoc of the workflow of Filter doesn't mention
> transform() at all.
> Would it make sense to apply transform() only if the return code for
> filterKeyValue() includes the KeyValue?
>
> C.
>
> -- Lars
> >
> >
> >----- Original Message -----
> >From: Christophe Taton <[EMAIL PROTECTED]>
> >To: [EMAIL PROTECTED]
> >Cc:
> >Sent: Sunday, June 30, 2013 10:26 PM
> >Subject: Re: Behavior of Filter.transform() in FilterList?
> >
> >On Sun, Jun 30, 2013 at 10:15 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
> >
> >> The clause 'family=X and column=Y and KeyOnlyFilter' would be represented
> >> by a FilterList, right?
> >> (family=A and column=B) would be represented by another FilterList.
> >>
> >
> >Yes, that would be FilterList(OR, [FilterList(AND, [family=X, column=Y,
> >KeyOnlyFilter]), FilterList(AND, [family=A, column=B])]).
> >
> >> So the behavior is expected.
> >
> >
> >Could you explain? I'm not sure how you reached this conclusion.
> >Are you saying it is expected, given the actual implementation of
> >FilterList.transform()?
> >Or are there some other details I missed?
> >
> >Thanks!
> >C.
> >
> >On Mon, Jul 1, 2013 at 1:10 PM, Christophe Taton <[EMAIL PROTECTED]>
> wrote:
> >>
> >> > Hi,
> >> >
> >> > From
> >> > https://github.com/apache/hbase/blob/0.95/hbase-client/src/main/java/org/apache/hadoop/hbase/filter/FilterList.java#L183
> >> > it appears that Filter.transform() is invoked unconditionally on all
> >> > filters in a FilterList hierarchy.
> >> >
> >> > This is quite confusing, especially since I may construct a filter like:
> >> >     (family=X and column=Y and KeyOnlyFilter) or (family=A and column=B)
> >> > The KeyOnlyFilter will remove all values from the KeyValues in A:B as well.
> >> >
> >> > Is my understanding correct? Is this an expected/intended behavior?
> >> >
> >> > Thanks,
> >> > C.
> >> >
> >>
> >
> >
>