RE: Scan (Start Row, End Row) vs Scan (Row)
Yes.

> -----Original Message-----
> From: Peter Haidinyak [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, January 20, 2011 8:41 AM
> To: [EMAIL PROTECTED]
> Subject: RE: Scan (Start Row, End Row) vs Scan (Row)
>
> Question: does HBase stop scanning once it reaches the end row? I thought it
> did.
>
> Thanks
>
> -Pete
>
> -----Original Message-----
> From: Jonathan Gray [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, January 20, 2011 8:09 AM
> To: [EMAIL PROTECTED]
> Subject: RE: Scan (Start Row, End Row) vs Scan (Row)
>
> The best way to do this is as Friso describes, using the existing stopRow
> parameter in Scan.
>
> Another way to do it is with startRow plus a filter.  The PrefixFilter
> could be used here.  Looking at the code, it seems that PrefixFilter
> does an early out and stops the scan once it has passed the prefix.
>
> If not, you can wrap any filter in a WhileMatchFilter.  With that wrapper,
> once the underlying filter fails for the first time, everything after it
> fails too and the scan does an early out.
>
> JG
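
The early-out behaviour described above can be modelled in a few lines of plain Java. This is only a sketch of the idea, not HBase code: `WhileMatchSketch`, `scanWhileMatch`, and the in-memory sorted key list are hypothetical stand-ins for a table scan, and `String.startsWith` stands in for a prefix filter. The keys are the sample rows from the original question, in sorted order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

public class WhileMatchSketch {

    /**
     * Walks the sorted keys from startRow onward and stops at the first key
     * the filter rejects, instead of testing every remaining key.
     */
    public static List<String> scanWhileMatch(
            List<String> sortedKeys, String startRow, Predicate<String> filter) {
        List<String> matched = new ArrayList<>();
        for (String key : sortedKeys) {
            if (key.compareTo(startRow) < 0) {
                continue; // before the start row (a real scan would seek past these)
            }
            if (!filter.test(key)) {
                break; // first failure ends the scan: the "early out"
            }
            matched.add(key);
        }
        return matched;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
                "20100809041500_abc_xyc",
                "20100809041500_abc_xyw",
                "20100809041500_abc_xyz",
                "20100809041500_abd_xyw",
                "20100809041500_abd_xyz",
                "20100809041500_abf_xyz");
        String prefix = "20100809041500_abd";

        // Count how many keys the filter actually sees.
        AtomicInteger tested = new AtomicInteger();
        Predicate<String> prefixFilter = key -> {
            tested.incrementAndGet();
            return key.startsWith(prefix);
        };

        System.out.println(scanWhileMatch(rows, prefix, prefixFilter));
        // prints [20100809041500_abd_xyw, 20100809041500_abd_xyz]
        System.out.println("keys tested: " + tested.get());
        // prints keys tested: 3
    }
}
```

Only three keys reach the filter: the two matches and the first non-matching key, which ends the scan.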
>
> > -----Original Message-----
> > From: Friso van Vollenhoven [mailto:[EMAIL PROTECTED]]
> > Sent: Thursday, January 20, 2011 12:45 AM
> > To: <[EMAIL PROTECTED]>
> > Subject: Re: Scan (Start Row, End Row) vs Scan (Row)
> >
> > Performing a scan with
> >
> > start row = 20100809041500_abd
> > end row = 20100809041500_abe
> >
> > will give you just that. The end row is exclusive, so it will only
> > return rows with VAR1 = abd. You need to compute the 'abe' yourself,
> > though: take 'abd' and increase the rightmost byte by 1; if that byte
> > is already at the max byte value, set it to 0 and increase the byte to
> > its left by 1 instead, and so on. There is no scan method that has
> > 'starts with' semantics, AFAIK.
> >
> > See here:
> >
> > http://hbase.apache.org/docs/r0.89.20100924/apidocs/org/apache/hadoop/hbase/client/Scan.html#Scan(byte%5B%5D,%20byte%5B%5D)
> >
> >
> > Friso
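
The "increase the rightmost byte" rule for deriving the end row from a prefix can be sketched in plain Java as follows. `PrefixStopRow` and `stopRowForPrefix` are hypothetical names, not part of the HBase API. Note one deliberate difference from the description above: this variant drops trailing 0xFF bytes before incrementing rather than resetting them to 0, which gives a slightly tighter bound. It returns an empty array when no upper bound exists (an empty or all-0xFF prefix).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class PrefixStopRow {

    /**
     * Returns the smallest key that sorts after every key starting with
     * the given prefix, for use as an exclusive end row. Returns an empty
     * array if the prefix is empty or consists only of 0xFF bytes.
     */
    public static byte[] stopRowForPrefix(byte[] prefix) {
        // Trailing 0xFF bytes cannot be incremented, so skip over them.
        int end = prefix.length;
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0]; // no finite upper bound for this prefix
        }
        // Copy the remaining bytes and bump the last one by 1.
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++;
        return stop;
    }

    public static void main(String[] args) {
        byte[] startRow = "20100809041500_abd".getBytes(StandardCharsets.US_ASCII);
        byte[] stopRow = stopRowForPrefix(startRow);
        System.out.println(new String(stopRow, StandardCharsets.US_ASCII));
        // prints 20100809041500_abe
    }
}
```

The two arrays would then be passed as the start row and end row of the scan, as in the constructor linked above.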
> >
> >
> >
> >
> > On 20 jan 2011, at 09:22, Shuja Rehman wrote:
> >
> > Hi
> > Consider the following scenario.
> >
> > Row Key Format = DATETIME_VAR1_VAR2 (where var1 and var2 have fixed
> > lengths)
> >
> > and example data could be
> >
> > 20100809041500_abc_xyz
> > 20100809041500_abc_xyw
> > 20100809041500_abc_xyc
> > *20100809041500_abd_xyz*
> > 20100809041500_abd_xyw
> > 20100809041500_abf_xyz
> > ...
> >
> > Now, if I want to get only the rows whose key starts with
> > 20100809041500_abd, is there any way to achieve that through a scan
> > without using a filter? If I use a filter, scan(startrow, filter)
> > where startrow="20100809041500_abd", then it will scan the whole table
> > from the start key to the end of the table. I want to scan just the part
> > of the table that I require. So is there any method like this
> >
> > scan(row)  where row="20100809041500_abd"  that returns just the
> > following results?
> >
> > 20100809041500_abd_xyz
> > 20100809041500_abd_xyw
> >
> > Kindly let me know whether this is achievable.
> > Thanks
> > --
> > Regards
> > Shuja-ur-Rehman Baig
> > <http://pk.linkedin.com/in/shujamughal>