HBase >> mail # user >> Understanding scan behaviour


Re: Understanding scan behaviour
I see, so I misunderstood the behaviour. My keys are id + timestamp so
that I can do range-type searches. What I really want is to return the rows
whose id matches a given prefix. Is there a way to do this without having to
scan large amounts of data?
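
One way to picture the prefix lookup asked about here is a bounded range: start at the prefix and stop at the smallest key that sorts after every key with that prefix. The following is a minimal sketch in plain Python, not HBase client code. It assumes row keys compare as raw bytes (as described in the quoted thread below), and the names `prefix_stop_row`, `prefix_scan` and the sample keys are hypothetical.

```python
from bisect import bisect_left


def prefix_stop_row(prefix: bytes) -> bytes:
    """Smallest key sorting after every key that starts with `prefix`.

    Returns b"" when no such bound exists (prefix is empty or all 0xFF).
    """
    trimmed = prefix.rstrip(b"\xff")  # trailing 0xFF bytes cannot be incremented
    if not trimmed:
        return b""
    return trimmed[:-1] + bytes([trimmed[-1] + 1])


def prefix_scan(sorted_keys, prefix):
    """Return the keys in `sorted_keys` that start with `prefix`.

    Only the matching range is touched; keys outside it are never visited.
    """
    start = bisect_left(sorted_keys, prefix)
    stop = prefix_stop_row(prefix)
    end = bisect_left(sorted_keys, stop) if stop else len(sorted_keys)
    return sorted_keys[start:end]


# Hypothetical "id + timestamp" keys, kept in byte order like rows in a table.
keys = sorted([b"+9hC-0001", b"-a1f-0001", b"sdw0-0001", b"sdw0-0002", b"sdw1-0001"])

print(prefix_scan(keys, b"sdw0"))     # [b'sdw0-0001', b'sdw0-0002']
print(prefix_scan(keys, b"azzzaaa"))  # []
```

In the shell, the same idea would presumably be expressed by passing both STARTROW and STOPROW to `scan`, so that a prefix with no matching rows returns nothing instead of the next row in the table.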

On Thu, Mar 28, 2013 at 8:26 AM, Jean-Marc Spaggiari <
[EMAIL PROTECTED]> wrote:

> Hi Mohit,
>
> "+" ascii code is 43
> "9" ascii code is 57.
>
> So "+9" is coming after "++". If you don't have any row with the exact
> key "+++++", HBase will look for the first one after this one. And in
> your case, it's +9hC\xFC\x82s\xABL3\xB3B\xC0\xF9\x87\x03\x7F\xFF\xF.
>
> JM
>
> 2013/3/28 Mohit Anchlia <[EMAIL PROTECTED]>:
> > My understanding is that the row key would start with +++++ for instance.
> >
> > On Thu, Mar 28, 2013 at 7:53 AM, Jean-Marc Spaggiari <
> > [EMAIL PROTECTED]> wrote:
> >
> >> Hi Mohit,
> >>
> >> I see nothing wrong with the results below. What would you have expected?
> >>
> >> JM
> >>
> >> 2013/3/28 Mohit Anchlia <[EMAIL PROTECTED]>:
> >> > I am running version 0.92.1 and this is what happens.
> >> >
> >> >
> >> > hbase(main):003:0> scan 'SESSIONID_TIMELINE', {LIMIT => 1, STARTROW =>
> >> > 'sdw0'}
> >> > ROW                                                  COLUMN+CELL
> >> >  s\xC1\xEAR\xDF\xEA&\x89\x91\xFF\x1A^\xB6d\xF0\xEC\x
> >> > column=SID_T_MTX:\x00\x00Rc, timestamp=1363056261106,
> >> > value=PAGE\x09\x091363056252990\x09\x09/
> >> >  7F\xFF\xFE\xC2\xA3\x84Z\x7F
> >> >
> >> > 1 row(s) in 0.0450 seconds
> >> > hbase(main):004:0> scan 'SESSIONID_TIMELINE', {LIMIT => 1, STARTROW =>
> >> > '------'}
> >> > ROW                                                  COLUMN+CELL
> >> >  -\xA1\xAF>r\xBD\xE2L\x00\xCD*\xD7\xE8\xD6\x1Dk\x7F\
> >> > column=SID_T_MTX:\x00\x00hF, timestamp=1363384706714,
> >> > value=PAGE\x09239923973\x091363384698919\x09/
> >> >  xFF\xFE\xC2\x8F\xF0\xC1\xBF
> >> > 1 row(s) in 0.0500 seconds
> >> > hbase(main):005:0> scan 'SESSIONID_TIMELINE', {LIMIT => 1, STARTROW =>
> >> > '++++'}
> >> > ROW                                                  COLUMN+CELL
> >> >  +9hC\xFC\x82s\xABL3\xB3B\xC0\xF9\x87\x03\x7F\xFF\xF
> >> > column=SID_T_MTX:\x00\x00<2, timestamp=1364404155426,
> >> > value=PAGE\x09\x091364404145275\x09 \x09/
> >> >  E\xC2S-\x08\x1F
> >> > 1 row(s) in 0.0640 seconds
> >> > hbase(main):006:0>
> >> >
> >> >
> >> > On Wed, Mar 27, 2013 at 9:23 PM, ramkrishna vasudevan <
> >> > [EMAIL PROTECTED]> wrote:
> >> >
> >> >> Same question, same time :)
> >> >>
> >> >> Regards
> >> >> Ram
> >> >>
> >> >> On Thu, Mar 28, 2013 at 9:53 AM, ramkrishna vasudevan <
> >> >> [EMAIL PROTECTED]> wrote:
> >> >>
> >> >> > Could you give us some more insights on this?
> >> >> > So you mean when you set the row key as 'azzzaaa', though this row does
> >> >> > not exist, the scanner returns some other row?  Or it is giving you a row
> >> >> > that does not exist?
> >> >> >
> >> >> > Or you mean it is doing a full table scan?
> >> >> >
> >> >> > Which version of HBase and what type of filters are you using?
> >> >> > Regards
> >> >> > Ram
> >> >> >
> >> >> >
> >> >> > On Thu, Mar 28, 2013 at 9:45 AM, Mohit Anchlia <[EMAIL PROTECTED]>
> >> >> > wrote:
> >> >> >
> >> >> >> I have keys of the form "hashedid + timestamp", but when I run a scan I
> >> >> >> get rows for almost every value. For instance, if I scan for 'azzzaaa',
> >> >> >> which doesn't even exist, I still get results.
> >> >> >>
> >> >> >> Could someone help me understand what might be going on here?
> >> >> >>
> >> >> >
> >> >> >
> >> >>
> >>
>
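
The start-row behaviour JM describes in the quoted thread (byte-wise key ordering, and a scan that begins at the first row at or after STARTROW) can be sketched in plain Python. This is only an illustration under stated assumptions, not HBase code: `first_row_at_or_after` is a hypothetical helper, and the row keys are shortened prefixes of the ones shown in the shell output.

```python
from bisect import bisect_left


def first_row_at_or_after(sorted_keys, start_row):
    """Return the first key >= start_row in byte order, or None if there is none."""
    i = bisect_left(sorted_keys, start_row)
    return sorted_keys[i] if i < len(sorted_keys) else None


# "+" is byte 43 and "9" is byte 57, so "+9..." sorts after "++++".
assert ord("+") == 43 and ord("9") == 57
assert b"++++" < b"+9"

# Shortened, illustrative versions of the three row keys seen in the thread.
rows = sorted([b"+9hC", b"-\xa1\xaf", b"s\xc1\xea"])

print(first_row_at_or_after(rows, b"++++"))     # b'+9hC'
print(first_row_at_or_after(rows, b"------"))   # b'-\xa1\xaf'
print(first_row_at_or_after(rows, b"sdw0"))     # b's\xc1\xea'
print(first_row_at_or_after(rows, b"azzzaaa"))  # b's\xc1\xea'
print(first_row_at_or_after(rows, b"zzz"))      # None
```

A start row that does not exist, such as 'azzzaaa', therefore still yields a result: with LIMIT => 1 and no stop row, the scan simply returns whichever row comes next in byte order.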