HBase >> mail # dev >> Where is scanner startRow used

Re: Where is scanner startRow used
Yeah, I just checked: we were already using startRow, and performance is
still significantly poorer than with the wide schema (close to unusable).

We are doing scans with a batch size of 50, but the scans are all over the
place (very random) because the schema is tall and not wide. I have created
a JIRA for this and I will report performance numbers there. But to me, not
seeking to the start row within a region feels clearly suboptimal.
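
To make the concern concrete, here is a toy model, not HBase code. It assumes a region can be approximated by a sorted in-memory map, and the row-key layout (`userNNN|itemNN`) and the class and method names are made up for illustration. It counts how many rows a prefix scan touches when it walks from the first row of the region versus when it seeks directly to the start row.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

class SeekVsFilter {
    // Toy stand-in for one region: 100 users x 10 items, kept in sorted key order.
    static NavigableMap<String, String> region() {
        NavigableMap<String, String> rows = new TreeMap<>();
        for (int user = 0; user < 100; user++) {
            for (int item = 0; item < 10; item++) {
                rows.put(String.format("user%03d|item%02d", user, item), "v");
            }
        }
        return rows;
    }

    // Walk from the first row of the region, skipping rows until the prefix
    // matches, and stop at the first row sorted after the prefix range.
    static int rowsTouchedWithoutSeek(NavigableMap<String, String> rows, String prefix) {
        int touched = 0;
        for (String key : rows.keySet()) {
            touched++;
            if (!key.startsWith(prefix) && key.compareTo(prefix) > 0) {
                break; // past the last possible match
            }
        }
        return touched;
    }

    // Seek straight to the first key >= prefix, then read until the prefix
    // stops matching.
    static int rowsTouchedWithSeek(NavigableMap<String, String> rows, String prefix) {
        int touched = 0;
        for (String key : rows.tailMap(prefix, true).keySet()) {
            touched++;
            if (!key.startsWith(prefix)) {
                break; // first row after the prefix range
            }
        }
        return touched;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> rows = region();
        String prefix = "user050|";
        System.out.println("without seek: " + rowsTouchedWithoutSeek(rows, prefix));
        System.out.println("with seek:    " + rowsTouchedWithSeek(rows, prefix));
    }
}
```

In this model the cost without a seek grows with the number of rows that sort before the prefix, while the cost with a seek depends only on the number of matching rows. Whether a real region server behaves like either variant is exactly the question in this thread.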

On Wed, May 15, 2013 at 11:48 AM, Anoop John <[EMAIL PROTECTED]> wrote:

> On the client side, see ScannerCallable, where this is passed to
> ServerCallable. Which region should be scanned first is decided based on
> this.
> I think the prefix filter was also discussed some time back. At that time
> the conclusion was likewise to use the start row. You can set a start row
> now, right? Please check the performance with this once.
> -Anoop-
> On Thu, May 16, 2013 at 12:02 AM, Varun Sharma <[EMAIL PROTECTED]>
> wrote:
> > Hi,
> >
> > Could someone please point me to where Scan.startRow is being used?
> >
> > From what I can see in HRegion.RegionScannerImpl, it is unused. A grep
> > does not seem to return any valid entries. But my knowledge of this part
> > is limited.
> >
> > We are debugging poor performance on prefix scans in tall schemas. If
> > this is really an issue, I will open a JIRA...
> >
> > Varun
> >
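
For reference, the advice in this thread (use a start row rather than relying on a prefix filter alone) amounts to turning a prefix into a row range. The sketch below is one common way to compute the exclusive upper bound of that range from the prefix bytes. The class and method names are hypothetical, and it uses only the JDK so it runs without HBase on the classpath.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class PrefixRange {
    // Returns an exclusive upper bound for all row keys that start with
    // `prefix`: drop trailing 0xFF bytes, then increment the last remaining
    // byte. Returns an empty array when no bound exists (empty prefix or a
    // prefix made only of 0xFF bytes).
    static byte[] stopRowForPrefix(byte[] prefix) {
        int end = prefix.length;
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--; // a trailing 0xFF cannot be incremented without overflow
        }
        if (end == 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++;
        return stop;
    }

    public static void main(String[] args) {
        byte[] start = "user050|".getBytes(StandardCharsets.US_ASCII);
        byte[] stop = stopRowForPrefix(start);
        System.out.println("startRow = " + new String(start, StandardCharsets.US_ASCII));
        System.out.println("stopRow  = " + new String(stop, StandardCharsets.US_ASCII));
    }
}
```

With the HBase client of this era, the two arrays would be passed to the scan as its start and stop rows (for example through `Scan.setStartRow` and `Scan.setStopRow`); an empty stop row means the scan has no upper bound. This only narrows the range the client asks for. It does not by itself settle whether the region server seeks to the start row inside a region.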