

Re: querying for relevant rows
Oh, did I interpret this wrong? I originally thought all of the timestamps
would be enumerated as rows, but after re-reading, I kind of get the idea
that the rows are being used as markers in a skip-list-like fashion.

On Fri, Jun 29, 2012 at 11:52 AM, Adam Fuchs <[EMAIL PROTECTED]> wrote:

> You can't scan backwards in Accumulo, but you probably don't need to. What
> you can do instead is use the last timestamp in the range as the key like
> this:
>
>     key=2  value= {a.1 b.1 c.2 d.2}
>     key=5  value= {m.3 n.4 o.5}
>     key=7  value= {x.6 y.6 z.7}
>
> As long as your ranges are non-overlapping, you can just stop when you get
> to the first key/value pair that starts after your given time range. If
> your ranges are overlapping then you will have to do a more complicated
> intersection between forward and reverse orderings to efficiently select
> ranges, or maybe use some type of hierarchical range intersection index
> akin to a binary space partitioning tree.
>
> Cheers,
> Adam
>
>
>
> On Fri, Jun 29, 2012 at 2:19 PM, Lam <[EMAIL PROTECTED]> wrote:
>
>> I'm using a timestamp as a key and the value is all the relevant data
>> starting at that timestamp up to the timestamp represented by the key
>> of the next row.
>>
>> When querying, I'm given a time span, consisting of a start and stop
>> time.  I want to return all the relevant data within the time span, so
>> I want to retrieve the appropriate rows (then filter the data for the
>> given timespan).
>>
>> Example:
>> In Accumulo:  (the format of the value is  <letter>.<timestamp>)
>>     key=1  value= {a.1 b.1 c.2 d.2}
>>     key=3  value= {m.3 n.4 o.5}
>>     key=6  value= {x.6 y.6 z.7}
>>
>> Query:  timespan=[2 4]  (get all data from timestamp 2 to 4, inclusive)
>>
>> Desired result: retrieve key=1 and key=3, then filter out a.1, b.1, and
>> o.5, and return the rest
>>
>> Problem: How do I know to retrieve key=1 and key=3 without scanning
>> all the keys?
>>
>> Can I create a scanner that looks for the given start key=2 and goes to
>> the prior row (i.e. key=1)?
>>
>> --
>> D. Lam
>>
>
>
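
Adam's suggestion (key each row by the last timestamp in its range, seek to
the start of the query span, and scan forward until a row begins after the
span) can be sketched roughly as follows. This is only an illustration under
stated assumptions: a sorted Python list stands in for the Accumulo table,
the names ROWS and query are hypothetical, and none of this is the Accumulo
API. It also assumes the ranges are non-overlapping and that the entries in
each value are sorted by timestamp.

```python
from bisect import bisect_left

# Rows keyed by the LAST timestamp each one covers, as in Adam's example.
# Each value is a list of (label, timestamp) pairs. A plain sorted list
# stands in for the table here; this is not the Accumulo API.
ROWS = [
    (2, [("a", 1), ("b", 1), ("c", 2), ("d", 2)]),
    (5, [("m", 3), ("n", 4), ("o", 5)]),
    (7, [("x", 6), ("y", 6), ("z", 7)]),
]


def query(rows, start, stop):
    """Return entries with start <= timestamp <= stop using a forward-only scan."""
    keys = [key for key, _ in rows]
    # Seek to the first row whose end timestamp is >= start; earlier rows
    # cannot contain anything in the requested span.
    i = bisect_left(keys, start)
    result = []
    while i < len(rows):
        _, entries = rows[i]
        # Entries are assumed sorted by timestamp, so the first one marks
        # where this row's range begins.
        if entries and entries[0][1] > stop:
            break  # this row starts after the span, so stop scanning
        result.extend(
            f"{label}.{ts}" for label, ts in entries if start <= ts <= stop
        )
        i += 1
    return result


print(query(ROWS, 2, 4))
```

For the timespan [2 4] from the original question, the scan starts at key=2,
continues through key=5, and stops at key=7 without reading any row before
the seek point. In a real table the seek would presumably be a scanner range
starting at the query's start time, with the client ending the scan once the
stop condition is met.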