HBase >> mail # user >> ResultCode.NEXT_ROW and scans with batching enabled


David Koch 2013-01-23, 00:13
Ted Yu 2013-01-23, 00:48
Anoop Sam John 2013-01-23, 03:44
Re: ResultCode.NEXT_ROW and scans with batching enabled
Hi guys,

Thank you for the explanations.

/David

On Wed, Jan 23, 2013 at 4:44 AM, Anoop Sam John <[EMAIL PROTECTED]> wrote:

> Hi,
>
> >In a scan, when a filter's filterKeyValue method returns
> >ReturnCode.NEXT_ROW - does it actually skip to the next row or just the
> >next batch
>
> It will go to the new row.
>
> >In HBase 0.92
> > hasFilterRow has not been overridden for certain filters which effectively
> > do filter out rows (SingleColumnValueFilter for example).
>
> Yes this is an issue in old versions. It is fixed in trunk now.
>
> > I spent some time looking at HRegion.java to get to grips with how
> > filterRow works (or not) when batching is enabled.
>
> See the method RegionScannerImpl#nextInternal(int limit) [in
> HRegion.java]. It contains a do-while loop that collects all the KVs for
> a row (so they can be grouped into one Result). The loop only checks the
> batch size (limit). When the filter says to go to the next row, there is a
> seek to the next row [as Ted said, see the code in StoreScanner]. This
> makes peekRow() return the next row key, which is not the same as
> currentRow [please see the code]. So this batch ends there, and the next
> batch contains KVs from the next row only.
>
> -Anoop-
> ________________________________________
> From: Ted Yu [[EMAIL PROTECTED]]
> Sent: Wednesday, January 23, 2013 6:18 AM
> To: [EMAIL PROTECTED]
> Subject: Re: ResultCode.NEXT_ROW and scans with batching enabled
>
> Take a look at StoreScanner#next():
>
>         ScanQueryMatcher.MatchCode qcode = matcher.match(kv);
> ...
>           case SEEK_NEXT_ROW:
>             // This is just a relatively simple end of scan fix, to short-cut end
>             // us if there is an endKey in the scan.
>             if (!matcher.moreRowsMayExistAfter(kv)) {
>               return false;
>             }
>             reseek(matcher.getKeyForNextRow(kv));
>             break;
>
> Cheers
>
> On Tue, Jan 22, 2013 at 4:13 PM, David Koch <[EMAIL PROTECTED]> wrote:
>
> > Hello,
> >
> > In a scan, when a filter's filterKeyValue method returns
> > ReturnCode.NEXT_ROW - does it actually skip to the next row or just the
> > next batch, provided of course batching is enabled? Where in the HBase
> > source code can I find out about this?
> >
> > I spent some time looking at HRegion.java to get to grips with how
> > filterRow works (or not) when batching is enabled. In HBase 0.92
> > hasFilterRow has not been overridden for certain filters which effectively
> > do filter out rows (SingleColumnValueFilter for example). Thus, these
> > filters do not generate a warning when used with a batched scan which -
> > while risky - provides the needed filtering in some cases. This has been
> > fixed for subsequent versions (at least 0.96) so I need to re-implement
> > custom filters which use this "effect".
> >
> > Thanks,
> >
> > /David
> >
>
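
The behaviour described in the thread (NEXT_ROW ends the current batch and the next batch starts at the following row) can be illustrated with a small toy model. This is only a sketch and does not use any HBase classes: `BatchedScanSketch`, `Cell`, `CellFilter`, `demoCells` and `scan` are hypothetical names invented for the illustration, and the loop is a simplified stand-in for the do-while loop in RegionScannerImpl#nextInternal, not a copy of it.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types are HBase classes.
class BatchedScanSketch {

    // Hypothetical stand-in for a KeyValue: a row key plus a column qualifier.
    static final class Cell {
        final String row;
        final String qualifier;
        Cell(String row, String qualifier) { this.row = row; this.qualifier = qualifier; }
        @Override public String toString() { return row + ":" + qualifier; }
    }

    // Only the two return codes needed for the illustration.
    enum ReturnCode { INCLUDE, NEXT_ROW }

    interface CellFilter { ReturnCode filterCell(Cell cell); }

    // Splits row-sorted cells into batches of at most `batch` cells, never mixing rows.
    static List<List<Cell>> scan(List<Cell> sortedCells, int batch, CellFilter filter) {
        List<List<Cell>> batches = new ArrayList<>();
        int i = 0;
        while (i < sortedCells.size()) {
            String currentRow = sortedCells.get(i).row;
            List<Cell> results = new ArrayList<>();
            // Collect cells of the current row until the row changes or the limit is hit.
            while (i < sortedCells.size()
                    && sortedCells.get(i).row.equals(currentRow)
                    && results.size() < batch) {
                Cell cell = sortedCells.get(i);
                if (filter.filterCell(cell) == ReturnCode.NEXT_ROW) {
                    // Model of the reseek: skip every remaining cell of this row.
                    while (i < sortedCells.size() && sortedCells.get(i).row.equals(currentRow)) {
                        i++;
                    }
                    break; // the batch ends here; the next one starts at the next row
                }
                results.add(cell);
                i++;
            }
            if (!results.isEmpty()) {
                batches.add(results);
            }
        }
        return batches;
    }

    // Sample data: row r1 has qualifiers a..e, row r2 has a and b.
    static List<Cell> demoCells() {
        List<Cell> cells = new ArrayList<>();
        for (String q : new String[] {"a", "b", "c", "d", "e"}) cells.add(new Cell("r1", q));
        for (String q : new String[] {"a", "b"}) cells.add(new Cell("r2", q));
        return cells;
    }

    public static void main(String[] args) {
        // The filter asks for NEXT_ROW as soon as it sees qualifier "c".
        CellFilter filter = c -> c.qualifier.equals("c") ? ReturnCode.NEXT_ROW : ReturnCode.INCLUDE;
        System.out.println(scan(demoCells(), 2, filter));
        // prints [[r1:a, r1:b], [r2:a, r2:b]]
    }
}
```

With a batch size of 2, the first batch holds r1:a and r1:b. The second pass hits r1:c, the filter returns NEXT_ROW, and r1:d and r1:e are never returned: the scan continues at r2 rather than at the next batch of r1.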