HBase >> mail # user >> Understanding scan behaviour


Mohit Anchlia 2013-03-28, 04:15
Ted Yu 2013-03-28, 04:22
ramkrishna vasudevan 2013-03-28, 04:23
ramkrishna vasudevan 2013-03-28, 04:23
Mohit Anchlia 2013-03-28, 14:38
Jean-Marc Spaggiari 2013-03-28, 14:53
Mohit Anchlia 2013-03-28, 15:17
Jean-Marc Spaggiari 2013-03-28, 15:26
Mohit Anchlia 2013-03-28, 16:02
Ted Yu 2013-03-28, 16:15
Re: Understanding scan behaviour
Could the prefix filter lead to a full table scan? In other words, is
PrefixFilter applied after fetching the rows?
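
To make the question concrete, here is a minimal sketch in plain Python (not HBase code). It models a table as a sorted list of byte-string row keys and compares two ways of finding the rows with a given prefix: walking from the first key and testing each one, versus seeking straight to the prefix and stopping at a computed stop row. The helper names (`prefix_stop_row`, `filter_only_scan`, `range_scan`) and the sample keys are hypothetical. The model only counts keys examined and says nothing definitive about how HBase evaluates PrefixFilter internally.

```python
from bisect import bisect_left


def prefix_stop_row(prefix: bytes) -> bytes:
    """Smallest key that sorts after every key starting with `prefix`.

    Returns b"" (meaning "no upper bound") if the prefix is all 0xFF bytes.
    """
    trimmed = prefix.rstrip(b"\xff")
    if not trimmed:
        return b""
    # Increment the last remaining byte to get an exclusive upper bound.
    return trimmed[:-1] + bytes([trimmed[-1] + 1])


def filter_only_scan(sorted_keys, prefix):
    """Walk from the first key, testing every key against the prefix."""
    examined, hits = 0, []
    for key in sorted_keys:
        examined += 1
        if key.startswith(prefix):
            hits.append(key)
        elif key > prefix:
            break  # sorted order: nothing later can match
    return hits, examined


def range_scan(sorted_keys, prefix):
    """Seek to the prefix and read only up to the computed stop row."""
    start = bisect_left(sorted_keys, prefix)
    stop = prefix_stop_row(prefix)
    end = bisect_left(sorted_keys, stop) if stop else len(sorted_keys)
    hits = sorted_keys[start:end]
    return hits, len(hits)


# Hypothetical keys of the form id + "-" + timestamp-like suffix.
keys = sorted(b"id%03d-%d" % (i, t) for i in range(100) for t in range(3))

print(filter_only_scan(keys, b"id050-"))
print(range_scan(keys, b"id050-"))
```

Both functions return the same rows; only the number of keys touched differs. In HBase shell terms, the bounded variant would roughly correspond to passing STARTROW and STOPROW with the scan, under the assumption that the scan seeks to the start row before any filter is evaluated.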

Another question: say I have row keys abc and abd and I search for
row "abc". Is it always guaranteed to be the first key returned in the
scan results? If so, I can always put a condition in the client app.
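
The ordering question can be illustrated with a small plain-Python model (again, not HBase code). It assumes row keys are compared byte by byte, as in the ASCII explanation quoted below, and that a scan with a start row begins at the first key greater than or equal to it. The function name `first_row_at_or_after` is hypothetical, and the sample keys are shortened versions of the keys shown in the quoted shell output.

```python
from bisect import bisect_left


def first_row_at_or_after(sorted_keys, start_row):
    """Return the first key >= start_row, or None if there is none."""
    i = bisect_left(sorted_keys, start_row)
    return sorted_keys[i] if i < len(sorted_keys) else None


# Shortened sample keys; Python compares bytes objects byte by byte.
keys = sorted([b"abd", b"abc", b"+9hC", b"-\xa1\xaf", b"s\xc1\xea"])

print(first_row_at_or_after(keys, b"abc"))    # exact key exists
print(first_row_at_or_after(keys, b"++++"))   # no exact match: next key in order
print(first_row_at_or_after(keys, b"sdw0"))
print(first_row_at_or_after(keys, b"zzz"))    # past the last key
```

In this model "abc" comes back first only because it exists. When the start row is absent, the next key in byte order is returned instead, so a client-side check on the returned key is still needed.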

On Thu, Mar 28, 2013 at 9:15 AM, Ted Yu <[EMAIL PROTECTED]> wrote:

> Take a look at the following in
> hbase-server/src/main/ruby/shell/commands/scan.rb
> (trunk)
>
>   hbase> scan 't1', {FILTER => "(PrefixFilter ('row2') AND
>     (QualifierFilter (>=, 'binary:xyz'))) AND (TimestampsFilter ( 123, 456))"}
>
> Cheers
>
> On Thu, Mar 28, 2013 at 9:02 AM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
>
> > I see then I misunderstood the behaviour. My keys are id + timestamp so
> > that I can do a range type search. So what I really want is to return a
> > row where id matches the prefix. Is there a way to do this without having
> > to scan large amounts of data?
> >
> >
> >
> > On Thu, Mar 28, 2013 at 8:26 AM, Jean-Marc Spaggiari <
> > [EMAIL PROTECTED]> wrote:
> >
> > > Hi Mohit,
> > >
> > > "+" ascii code is 43
> > > "9" ascii code is 57.
> > >
> > > So "+9" is coming after "++". If you don't have any row with the exact
> > > key "+++++", HBase will look for the first one after this one. And in
> > > your case, it's +9hC\xFC\x82s\xABL3\xB3B\xC0\xF9\x87\x03\x7F\xFF\xF.
> > >
> > > JM
> > >
> > > 2013/3/28 Mohit Anchlia <[EMAIL PROTECTED]>:
> > > > > My understanding is that the row key would start with +++++ for
> > > > > instance.
> > > >
> > > > On Thu, Mar 28, 2013 at 7:53 AM, Jean-Marc Spaggiari <
> > > > [EMAIL PROTECTED]> wrote:
> > > >
> > > >> Hi Mohit,
> > > >>
> > > > >> I see nothing wrong with the results below. What would I have
> > > > >> expected?
> > > >>
> > > >> JM
> > > >>
> > > >> 2013/3/28 Mohit Anchlia <[EMAIL PROTECTED]>:
> > > >>  > I am running 92.1 version and this is what happens.
> > > >> >
> > > >> >
> > > > >> > hbase(main):003:0> scan 'SESSIONID_TIMELINE', {LIMIT => 1, STARTROW => 'sdw0'}
> > > >> > ROW                                                  COLUMN+CELL
> > > >> >  s\xC1\xEAR\xDF\xEA&\x89\x91\xFF\x1A^\xB6d\xF0\xEC\x
> > > >> > column=SID_T_MTX:\x00\x00Rc, timestamp=1363056261106,
> > > >> > value=PAGE\x09\x091363056252990\x09\x09/
> > > >> >  7F\xFF\xFE\xC2\xA3\x84Z\x7F
> > > >> >
> > > >> > 1 row(s) in 0.0450 seconds
> > > > >> > hbase(main):004:0> scan 'SESSIONID_TIMELINE', {LIMIT => 1, STARTROW => '------'}
> > > >> > ROW                                                  COLUMN+CELL
> > > >> >  -\xA1\xAF>r\xBD\xE2L\x00\xCD*\xD7\xE8\xD6\x1Dk\x7F\
> > > >> > column=SID_T_MTX:\x00\x00hF, timestamp=1363384706714,
> > > >> > value=PAGE\x09239923973\x091363384698919\x09/
> > > >> >  xFF\xFE\xC2\x8F\xF0\xC1\xBF
> > > >> >   row(s) in 0.0500 seconds
> > > > >> > hbase(main):005:0> scan 'SESSIONID_TIMELINE', {LIMIT => 1, STARTROW => '++++'}
> > > >> > ROW                                                  COLUMN+CELL
> > > >> >  +9hC\xFC\x82s\xABL3\xB3B\xC0\xF9\x87\x03\x7F\xFF\xF
> > > >> > column=SID_T_MTX:\x00\x00<2, timestamp=1364404155426,
> > > >> > value=PAGE\x09\x091364404145275\x09 \x09/
> > > >> >  E\xC2S-\x08\x1F
> > > >> > 1 row(s) in 0.0640 seconds
> > > >> > hbase(main):006:0>
> > > >> >
> > > >> >
> > > >> > On Wed, Mar 27, 2013 at 9:23 PM, ramkrishna vasudevan <
> > > >> > [EMAIL PROTECTED]> wrote:
> > > >> >
> > > >> >> Same question, same time :)
> > > >> >>
> > > >> >> Regards
> > > >> >> Ram
> > > >> >>
> > > >> >> On Thu, Mar 28, 2013 at 9:53 AM, ramkrishna vasudevan <
> > > >> >> [EMAIL PROTECTED]> wrote:
> > > >> >>
> > > >> >> > Could you give us some more insights on this?
> > > > >> >> > So you mean when you set the row key as 'azzzaaa', though this
> > > > >> >> > row does not exist, the scanner returns some other row?  Or it
> > > > >> >> > is giving you a
>
Ted Yu 2013-03-28, 17:23
Li, Min 2013-03-29, 05:48
ramkrishna vasudevan 2013-03-29, 06:20
James Taylor 2013-03-29, 07:44
Mohit Anchlia 2013-03-29, 16:31
Asaf Mesika 2013-03-30, 13:55
Mohit Anchlia 2013-03-30, 15:25
Ted Yu 2013-03-30, 16:37