Re: ResultCode.NEXT_ROW and scans with batching enabled
Hi guys,

Thank you for the explanations.

/David

On Wed, Jan 23, 2013 at 4:44 AM, Anoop Sam John <[EMAIL PROTECTED]> wrote:

> Hi,
>
> >In a scan, when a filter's filterKeyValue method returns
> >ReturnCode.NEXT_ROW - does it actually skip to the next row or just the
> >next batch?
>
> It will go to the new row.
>
> >In HBase 0.92
> > hasFilterRow has not been overridden for certain filters which effectively
> > do filter out rows (SingleColumnValueFilter for example).
>
> Yes, this is an issue in old versions. It is fixed in trunk now.
>
> > I spent some time looking at HRegion.java to get to grips with how
> > filterRow works (or not) when batching is enabled.
>
> See the method RegionScannerImpl#nextInternal(int limit) [in
> HRegion.java]. It contains a do-while loop that collects the KVs of a
> single row (so that they can be grouped into one Result), checking only
> the batch size (limit). When the filter says to go to the next row, there
> is a seek to the next row [as Ted said, see the code in StoreScanner].
> After that seek, peekRow() returns the next row key, which is not the same
> as currentRow [please see the code]. So the current batch ends there, and
> the next batch will contain KVs from the next row only.
>
> -Anoop-
> ________________________________________
> From: Ted Yu [[EMAIL PROTECTED]]
> Sent: Wednesday, January 23, 2013 6:18 AM
> To: [EMAIL PROTECTED]
> Subject: Re: ResultCode.NEXT_ROW and scans with batching enabled
>
> Take a look at StoreScanner#next():
>
>         ScanQueryMatcher.MatchCode qcode = matcher.match(kv);
>
> ...
>
>           case SEEK_NEXT_ROW:
>             // This is just a relatively simple end of scan fix, to short-cut end
>             // us if there is an endKey in the scan.
>             if (!matcher.moreRowsMayExistAfter(kv)) {
>               return false;
>             }
>
>             reseek(matcher.getKeyForNextRow(kv));
>             break;
>
> Cheers
>
> On Tue, Jan 22, 2013 at 4:13 PM, David Koch <[EMAIL PROTECTED]> wrote:
>
> > Hello,
> >
> > In a scan, when a filter's filterKeyValue method returns
> > ReturnCode.NEXT_ROW - does it actually skip to the next row or just the
> > next batch, provided of course batching is enabled? Where in the HBase
> > source code can I find out about this?
> >
> > I spent some time looking at HRegion.java to get to grips with how
> > filterRow works (or not) when batching is enabled. In HBase 0.92
> > hasFilterRow has not been overridden for certain filters which effectively
> > do filter out rows (SingleColumnValueFilter for example). Thus, these
> > filters do not generate a warning when used with a batched scan which -
> > while risky - provides the needed filtering in some cases. This has been
> > fixed for subsequent versions (at least 0.96) so I need to re-implement
> > custom filters which use this "effect".
> >
> > Thanks,
> >
> > /David
> >
>
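
The behaviour Anoop describes (a batch never spans two rows, and NEXT_ROW ends the current batch by seeking past the rest of the row) can be illustrated with a small stand-alone model. This is only a sketch of the idea, not the HBase implementation: the class BatchedScanSketch, the Cell type, the "stop" qualifier rule and the sample data are all made up for illustration, and the real logic lives in RegionScannerImpl#nextInternal and StoreScanner.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchedScanSketch {
    /** Simplified stand-in for Filter.ReturnCode; only two codes are modelled. */
    enum ReturnCode { INCLUDE, NEXT_ROW }

    /** Simplified stand-in for a KeyValue: a row key plus a column qualifier. */
    static final class Cell {
        final String row;
        final String qualifier;
        Cell(String row, String qualifier) { this.row = row; this.qualifier = qualifier; }
        @Override public String toString() { return row + "/" + qualifier; }
    }

    /** Hypothetical filter rule: on qualifier "stop", give up on the rest of the row. */
    static ReturnCode filterKeyValue(Cell cell) {
        return cell.qualifier.equals("stop") ? ReturnCode.NEXT_ROW : ReturnCode.INCLUDE;
    }

    /** Splits sorted cells into batches of at most `limit` cells, never crossing a row. */
    static List<List<Cell>> scan(List<Cell> cells, int limit) {
        List<List<Cell>> batches = new ArrayList<>();
        int i = 0;
        while (i < cells.size()) {
            String currentRow = cells.get(i).row;
            List<Cell> batch = new ArrayList<>();
            // Collect cells of the current row until the batch limit is reached.
            while (i < cells.size() && cells.get(i).row.equals(currentRow)
                    && batch.size() < limit) {
                Cell cell = cells.get(i);
                if (filterKeyValue(cell) == ReturnCode.NEXT_ROW) {
                    // Model of the reseek: jump to the first cell of the next row.
                    while (i < cells.size() && cells.get(i).row.equals(currentRow)) {
                        i++;
                    }
                    break; // the current batch ends here
                }
                batch.add(cell);
                i++;
            }
            if (!batch.isEmpty()) {
                batches.add(batch);
            }
        }
        return batches;
    }

    /** Made-up sample data: row r1 has six cells, row r2 has two. */
    static List<Cell> sampleCells() {
        List<Cell> cells = new ArrayList<>();
        for (String q : new String[] {"a", "b", "c", "stop", "d", "e"}) {
            cells.add(new Cell("r1", q));
        }
        cells.add(new Cell("r2", "a"));
        cells.add(new Cell("r2", "b"));
        return cells;
    }

    public static void main(String[] args) {
        // Prints [[r1/a, r1/b], [r1/c], [r2/a, r2/b]]
        System.out.println(scan(sampleCells(), 2));
    }
}
```

With a batch size of 2, the second batch for r1 is cut short at the "stop" cell, r1/d and r1/e are never returned, and the following batch starts at r2. In this model NEXT_ROW skips the remainder of the row, not just the remainder of the batch.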