HBase, mail # user - HBase0.92: In Filter, ReturnCode.NEXT_ROW may lead to next columnFamily but not next row?


Re: HBase0.92: In Filter, ReturnCode.NEXT_ROW may lead to next columnFamily but not next row?
yuzhihong@... 2012-03-03, 02:52
Thanks Lars for explaining this.

Did you mean that it would be easier to explain in German? :-)

On Mar 2, 2012, at 6:44 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:

> I finally looked into it. This is expected.
>
> Filters are executed (by a StoreScanner/ScanQueryMatcher) per store. We have a store per column family.
> The important point is that there is no intrinsic order between KeyValues that differ only in the column family. That is by design, so that stores can be handled in parallel (even though we do not currently do that).
>
> The filter behaves as if every store is scanned in parallel. Each store starts at the beginning, and then each store needs to skip ahead using the filter.
> This is why it seemed to you that NEXT_ROW only seeks to the next column family: what you see is the beginning of the scan for the next column family.
>
> Makes sense? It's a bit hard to explain in English :)
>
> -- Lars
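
The per-store behaviour Lars describes can be illustrated with a toy model. The sketch below uses plain Java collections only and no HBase classes. The names `PerStoreScanSketch`, `sampleStores` and `scan` are made up for illustration, and the sample rows are loosely based on the log further down the thread. It assumes a filter that answers NEXT_ROW for every cell it sees, and it simplifies the real StoreScanner/ScanQueryMatcher machinery to a nested loop, so treat it as an approximation of the idea and not as how HBase is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model only: plain Java collections, no HBase classes.
class PerStoreScanSketch {

    // Hypothetical sample data: family -> (row key -> qualifiers in that family).
    static Map<String, Map<String, List<String>>> sampleStores() {
        Map<String, List<String>> info = new TreeMap<>();
        info.put("PERSONA3", Arrays.asList("active", "address", "name"));
        info.put("PERSONA4", Arrays.asList("active", "address", "name"));

        Map<String, List<String>> npo = new TreeMap<>();
        npo.put("PERSONA3", Arrays.asList("059201"));
        npo.put("PERSONA4", Arrays.asList("059201"));

        Map<String, List<String>> cert = new TreeMap<>();
        cert.put("PERSONA4", Arrays.asList("certSN"));

        // One "store" per column family, iterated in sorted family order.
        Map<String, Map<String, List<String>>> stores = new TreeMap<>();
        stores.put("info", info);
        stores.put("npo", npo);
        stores.put("cert", cert);
        return stores;
    }

    // Returns the calls seen by a filter that answers NEXT_ROW for every cell.
    static List<String> scan(Map<String, Map<String, List<String>>> stores) {
        // Collect every row key that appears in any store, in sorted order.
        TreeSet<String> rows = new TreeSet<>();
        for (Map<String, List<String>> store : stores.values()) {
            rows.addAll(store.keySet());
        }

        List<String> log = new ArrayList<>();
        for (String row : rows) {
            log.add("[filterRowKey] " + row);
            for (Map.Entry<String, Map<String, List<String>>> store : stores.entrySet()) {
                List<String> qualifiers = store.getValue().get(row);
                if (qualifiers == null || qualifiers.isEmpty()) {
                    continue; // this family has no cells for the row
                }
                // The filter sees this store's first cell and answers NEXT_ROW,
                // which skips the rest of the row in this store only.
                log.add("[filterKeyValue] family:" + store.getKey()
                        + " | qualifier:" + qualifiers.get(0) + " -> NEXT_ROW");
            }
            log.add("[reset]");
        }
        return log;
    }

    public static void main(String[] args) {
        for (String line : scan(sampleStores())) {
            System.out.println(line);
        }
    }
}
```

In this model, a row with cells in three families produces three filterKeyValue lines before the reset, one per store. That is the same shape as the PERSONA4 entries in the original log: NEXT_ROW ends the row within one store, and the next call comes from the start of the same row in the next store.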
>
> From: NNever <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]; Ted Yu <[EMAIL PROTECTED]>
> Sent: Wednesday, February 22, 2012 7:20 PM
> Subject: Re: HBase0.92: In Filter, ReturnCode.NEXT_ROW may lead to next columnFamily but not next row?
>
> Thanks Ted, I didn't know the mailing list strips attachments.
>
> Here are the attachments:
>
> TestFilter.java:    http://pastebin.com/zC6EF8pX
> and the log:  http://pastebin.com/RsKJSHcn
>
> 2012/2/23 Ted Yu <[EMAIL PROTECTED]>
>
> > N:
> > Can you publish your code on pastebin or somewhere?
> > The mailing list strips attachments.
> >
> > Thanks
> >
> >
> > On Tue, Feb 21, 2012 at 5:47 PM, NNever <[EMAIL PROTECTED]> wrote:
> >
> >> Attached is my test custom filter code, TestFilter.
> >> It simply extends FilterBase and adds some System.out output.
> >> You can try it against any table that has more than one column family, like this:
> >>
> >> Scan scan = new Scan();
> >> scan.setFilter(new TestFilter());
> >> hTable.getScanner(scan);
> >>
> >> and look at the HBase log...
> >>
> >> It seems there really is a bug here. When filterKeyValue returns
> >> ReturnCode.NEXT_ROW, it jumps to the next column family but not the next row.
> >> There is also one strange thing: why is filterRow() not called?
> >>
> >> 2012/2/21 <[EMAIL PROTECTED]>
> >>
> >>> The javadoc says filterRow() will still be called.
> >>>
> >>> Can you show us your filterRow() code?
> >>>
> >>> Thanks
> >>>
> >>>
> >>>
> >>> On Feb 21, 2012, at 7:28 AM, NNever <[EMAIL PROTECTED]> wrote:
> >>>
> >>> > Hi~
> >>> >
> >>> > I have a custom filter that overrides filterKeyValue(KeyValue v).
> >>> > When the filter sees a row's first KeyValue, it returns
> >>> > ReturnCode.NEXT_ROW to jump to the next row.
> >>> >
> >>> > In practice, though, the result changes when there is more than one
> >>> > column family (here are some logs):
> >>> >
> >>> > [filterRowKey] PERSONA1
> >>> > [filterKeyValue] family:info | qualifier:active | value:\x00
> >>> > [filterKeyValue] returnCode is NEXT_ROW
> >>> > [reset]
> >>> > [filterRowKey] PERSONA2
> >>> > [filterKeyValue] family:info | qualifier:active | value:\x00
> >>> > [filterKeyValue] returnCode is NEXT_ROW
> >>> > [reset]
> >>> > [filterRowKey] PERSONA3
> >>> > [filterKeyValue] family:info | qualifier:active | value:\x00
> >>> > [filterKeyValue] returnCode is NEXT_ROW
> >>> > [filterKeyValue] family:npo | qualifier:059201 | value:
> >>> > [filterKeyValue] returnCode is NEXT_ROW
> >>> > [reset]
> >>> > [filterRowKey] PERSONA4
> >>> > [filterKeyValue] family:cert | qualifier:certSN | value:PERSONAL4314120472582094317514215676313826416149
> >>> > [filterKeyValue] returnCode is NEXT_ROW
> >>> > [filterKeyValue] family:info | qualifier:active | value:\x00
> >>> > [filterKeyValue] returnCode is NEXT_ROW
> >>> > [filterKeyValue] family:npo | qualifier:059201 | value:
> >>> > [filterKeyValue] returnCode is NEXT_ROW
> >>> > [reset]
> >>> >
> >>> > The table schema is:
> >>> > User
> >>> > info:name, info:address, info:active.... (info family, every record has
> >>> > values)