Re: Filter with State
Jerry Lam 2012-08-02, 13:50
That is useful. I appreciate it. The idea about cross row transaction is an
Can I have an iterator on the client side that gets rows from a coprocessor?
(i.e., filtered rows are streamed to the client application, and the client can
access them via an iterator)
On Thu, Aug 2, 2012 at 12:13 AM, lars hofhansl <[EMAIL PROTECTED]> wrote:
> The Filter is initialized per Region as part of a RegionScannerImpl.
> So as long as all the rows you are interested in are co-located in the same
> region you can keep that state in the Filter instance.
> You can use a custom RegionSplitPolicy to control (to some extent at
> least) how the rows are colocated (KeyPrefixRegionSplitPolicy is an
> I also blogged about this here (in the context of cross row transactions):
> Maybe what you really are looking for are coprocessors?
> -- Lars
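The per-region behavior described above can be illustrated with a small plain-Java model. This is only a sketch under stated assumptions: PreviousRowFilter, scanRegion and scanTable are hypothetical names, regions are modeled as sorted maps, and nothing here uses the real HBase Filter or scanner API. It shows one filter instance per region scan keeping state from row to row, and that state being lost at a region boundary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class StatefulFilterSketch {

    /** Hypothetical stand-in for a server-side filter; NOT the HBase Filter API. */
    static final class PreviousRowFilter {
        private Integer previousValue = null; // state carried from row to row

        /** Keeps a row only if its value is greater than the previous row's value. */
        boolean include(int value) {
            boolean keep = previousValue == null || value > previousValue;
            previousValue = value;
            return keep;
        }
    }

    /** Scans one region (a sorted row -> value map) with a fresh filter instance. */
    public static List<String> scanRegion(TreeMap<String, Integer> region) {
        PreviousRowFilter filter = new PreviousRowFilter(); // one instance per region scan
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Integer> row : region.entrySet()) {
            if (filter.include(row.getValue())) {
                kept.add(row.getKey());
            }
        }
        return kept;
    }

    /** Scans regions in order; filter state does not cross region boundaries. */
    public static List<String> scanTable(List<TreeMap<String, Integer>> regions) {
        List<String> kept = new ArrayList<>();
        for (TreeMap<String, Integer> region : regions) {
            kept.addAll(scanRegion(region));
        }
        return kept;
    }

    public static void main(String[] args) {
        TreeMap<String, Integer> regionA = new TreeMap<>();
        regionA.put("row1", 5);
        regionA.put("row2", 3);
        regionA.put("row3", 7);
        TreeMap<String, Integer> regionB = new TreeMap<>();
        regionB.put("row4", 6);

        System.out.println(scanRegion(regionA));
        System.out.println(scanTable(Arrays.asList(regionA, regionB)));
    }
}
```

In this model row2 is dropped because 3 is not greater than the previous row's 5, while row4 is kept even though 6 is less than row3's 7: the scan of the second region starts with a fresh filter, so the dependency on the previous row is silently lost. Keeping dependent rows in the same region avoids that case.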
> ----- Original Message -----
> From: Jerry Lam <[EMAIL PROTECTED]>
> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> Sent: Wednesday, August 1, 2012 7:06 PM
> Subject: Re: Filter with State
> Hi Lars,
> I understand that it is more difficult to carry state across
> regions/servers, but how about within a single region? Knowing that the rows in a
> single region have dependencies, can we have a filter with state? If a filter
> doesn't provide this ability, is there another mechanism in HBase that offers
> this kind of functionality?
> I think this is a good feature because it allows efficient scanning of
> dependent rows. Instead of fetching each row to the client side and checking
> if we should fetch the next row, the filter on the server side handles this
> Best Regards,
> Sent from my iPad (sorry for spelling mistakes)
> On 2012-08-01, at 21:52, lars hofhansl <[EMAIL PROTECTED]> wrote:
> > The issue here is that different rows can be located in different
> > regions or even on different region servers, so no local state will carry
> > over across all rows.
> > ----- Original Message -----
> > From: Jerry Lam <[EMAIL PROTECTED]>
> > To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> > Cc: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> > Sent: Wednesday, August 1, 2012 5:48 PM
> > Subject: Re: Filter with State
> > Hi St.Ack:
> > Schema cannot be changed to a single row.
> > The API says, "Do not rely on filters carrying state across rows;
> > its not reliable in current hbase as we have no handlers in place for when
> > regions split, close or server crashes." If we manage region splitting
> > ourselves, the split issue doesn't apply. Other failures can be handled
> > at the application level. Does each invocation of scanner.next instantiate
> > a new filter on the server side, even for the same region? (I.e., does scanning
> > the same region use the same filter or a different filter depending on the
> > scanner.next calls?)
> > Best Regards,
> > Jerry
> > Sent from my iPad (sorry for spelling mistakes)
> > On 2012-08-01, at 18:44, Stack <[EMAIL PROTECTED]> wrote:
> >> On Wed, Aug 1, 2012 at 10:44 PM, Jerry Lam <[EMAIL PROTECTED]>
> >>> Hi HBase guru:
> >>> From Lars George's talk, he mentions that a filter has no state. What if I
> >>> want to scan rows in which the decision to filter a row or not is based on the
> >>> previous row's column values? Any idea how one can implement this type of
> >>> logic?
> >> You could try carrying state in the client (but if client dies state
> >> You can't have scanners carry state across rows. It says so in the API
> >> (Whatever about the API, if LarsG says it, it must be so!).
> >> Here is the issue: If row X is in region A on server 1 there is
> >> nothing to prevent row X+1 from being on region B on server 2. How do