HBase, mail # user - How to query by rowKey-infix


Re: How to query by rowKey-infix
Jerry Lam 2012-07-31, 17:10
Hi Chris:

I'm thinking about building a secondary index for looking up the primary keys,
then querying by those primary keys in parallel.

I'd be interested to see whether there are other options, too.
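
The secondary-index idea can be sketched roughly as follows. This is only an in-memory illustration, not HBase client code: the dictionary `primary`, the sorted list `index`, the sample keys, and all function names are hypothetical stand-ins, and it assumes ASCII keys whose components contain no hyphens. The index re-orders the userId-date-sessionId key so the date comes first, which turns the time-range query into a contiguous range scan; the primary rows are then fetched in parallel.

```python
import bisect
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the primary table (row key -> row data).
primary = {
    "u1-20120730-s1": {"clicks": 3},
    "u1-20120731-s2": {"clicks": 5},
    "u2-20120729-s3": {"clicks": 1},
    "u2-20120731-s4": {"clicks": 7},
}

def index_key(primary_key):
    """Build a date-first index key from a userId-date-sessionId row key."""
    user_id, date, session_id = primary_key.split("-")
    return f"{date}-{user_id}-{session_id}"

# Stand-in for the index table: (index key, primary key) pairs kept sorted,
# the way HBase keeps rows sorted by row key.
index = sorted((index_key(k), k) for k in primary)

def primary_keys_in_range(start_date, end_date):
    """Range-scan the index for dates in [start_date, end_date]."""
    lo = bisect.bisect_left(index, (start_date,))
    # "~" sorts after "-" in ASCII, so this bound covers all of end_date.
    hi = bisect.bisect_left(index, (end_date + "~",))
    return [pk for _, pk in index[lo:hi]]

def fetch(primary_key):
    """Stand-in for a point lookup against the primary table."""
    return primary_key, primary[primary_key]

def query_by_date_range(start_date, end_date):
    """Look up primary keys via the index, then fetch the rows in parallel."""
    keys = primary_keys_in_range(start_date, end_date)
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(fetch, keys))
```

The trade-off in this sketch is that every write to the primary table also needs a write to the index, and the two have to be kept consistent by the application.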

Best Regards,

Jerry

On Tue, Jul 31, 2012 at 11:27 AM, Christian Schäfer <[EMAIL PROTECTED]> wrote:

> Hello there,
>
> I designed a row key for queries that need best performance (~100 ms)
> which looks like this:
>
> userId-date-sessionId
>
> These queries (scans) are always based on a userId and sometimes
> on a date as well.
> That's no problem with the key above.
>
> However, another kind of query has to be based on a given time range,
> where the leftmost key component, the userId, is not given or known.
> In this case I need to get all rows covering the given time range, along with
> their dates, to create a daily report.
>
> As I can't set wildcards at the beginning of a left-based index for the scan,
> I only see the possibility to scan the index of the whole table to collect
> the rowKeys that are inside the time range I'm interested in.
>
> Is there a more elegant way to collect rows within time range X?
> (Unfortunately, the date attribute is not equal to the timestamp that is
> stored by HBase automatically.)
>
> Could/should one maybe leverage some kind of row key caching to accelerate
> the collection process?
> Is that covered by the block cache?
>
> Thanks in advance for any advice.
>
> regards
> Chris
>
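
To illustrate the difference between the two query types in the quoted message, here is a small in-memory sketch, not HBase client code. The sorted list `row_keys`, the sample keys, and the function names are hypothetical, and it assumes ASCII keys whose components contain no hyphens. A query with a known userId becomes a bounded range over the sorted keys, while a query on the date infix alone has to examine every key. In HBase itself, a row filter with a regex comparator could do such matching on the server side, but as far as I know it would still read through the whole table.

```python
import bisect

# Hypothetical row keys in the userId-date-sessionId layout, kept sorted
# the way HBase keeps rows sorted by row key.
row_keys = sorted([
    "u1-20120730-s1",
    "u1-20120731-s2",
    "u2-20120729-s3",
    "u2-20120731-s4",
])

def prefix_scan(prefix):
    """Return all keys starting with prefix, using only a bounded range."""
    # Stop key: the prefix with its last character incremented by one.
    stop = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    lo = bisect.bisect_left(row_keys, prefix)
    hi = bisect.bisect_left(row_keys, stop)
    return row_keys[lo:hi]

def infix_scan(start_date, end_date):
    """Return keys whose date lies in [start_date, end_date].

    Also returns how many keys had to be examined, which is always
    the full table because the date is not the leading key component.
    """
    matches = []
    examined = 0
    for key in row_keys:
        examined += 1
        _user_id, date, _session_id = key.split("-")
        if start_date <= date <= end_date:
            matches.append(key)
    return matches, examined
```

For example, `prefix_scan("u1-")` touches only the two keys of user `u1`, whereas `infix_scan("20120730", "20120731")` examines all four keys to find three matches.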