Graeme Wallace 2013-04-08, 18:23
Jean-Marc Spaggiari 2013-04-08, 18:27
Graeme Wallace 2013-04-08, 18:30
Jean-Marc Spaggiari 2013-04-08, 18:36
Graeme Wallace 2013-04-08, 18:39
Ted Yu 2013-04-08, 18:39
Graeme Wallace 2013-04-08, 19:10
Jean-Marc Spaggiari 2013-04-08, 20:31
Ted Yu 2013-04-08, 20:55
Re: Best way to query multiple sets of rows
James Taylor 2013-04-08, 18:39
Are you familiar with Phoenix (https://github.com/forcedotcom/phoenix),
a SQL skin over HBase? We've just introduced a new feature (still in the
master branch) that'll do what you're looking for: transparently doing a
skip scan over the chunks of your HBase data based on your SQL query. It
leverages HBase's ability to have a filter return a "skip next" hint.
We've found it can make a pretty dramatic performance difference (50x),
depending on the cardinality of your data and the size of the chunks
you're returning.
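To make the skip-scan idea concrete, here is a rough sketch in plain Python. It is not Phoenix's or HBase's code: the sorted list stands in for a table's row keys, `skip_scan` is a hypothetical helper name, and `bisect_left` plays the role of the filter's seek hint. The ranges are assumed to be sorted, non-overlapping, and half-open (start inclusive, stop exclusive), mirroring how an HBase Scan treats its start and stop rows.

```python
from bisect import bisect_left


def skip_scan(sorted_keys, ranges):
    """Collect keys that fall in any [start, stop) range in a single pass.

    sorted_keys: row keys in ascending order (stand-in for a table).
    ranges: non-overlapping (start, stop) pairs, sorted by start.
    Returns (matches, rows_examined).
    """
    matches = []
    examined = 0
    pos = 0
    for start, stop in ranges:
        # Seek: jump straight to the first key >= start instead of
        # stepping over every row between the previous range and this one.
        pos = bisect_left(sorted_keys, start, pos)
        while pos < len(sorted_keys):
            examined += 1
            if sorted_keys[pos] >= stop:
                break  # past the end of this range; seek to the next one
            matches.append(sorted_keys[pos])
            pos += 1
    return matches, examined


if __name__ == "__main__":
    keys = ["row%03d" % i for i in range(100)]
    wanted = [("row010", "row013"), ("row040", "row042"), ("row090", "row091")]
    rows, touched = skip_scan(keys, wanted)
    print(rows)     # the 6 matching keys
    print(touched)  # 9 keys examined, versus 81 for one scan from row010 to row091
```

The saving comes from never touching the rows that lie between the requested ranges, which is why the benefit grows with the size of the gaps relative to the size of the chunks returned.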
On 04/08/2013 11:30 AM, Graeme Wallace wrote:
> I thought a Scan could only cope with one start row and one end row?
> On Mon, Apr 8, 2013 at 1:27 PM, Jean-Marc Spaggiari <[EMAIL PROTECTED]> wrote:
>> Hi Graeme,
>> Scans are the right way to do that.
>> They will give you back all the data you need, chunk by chunk. Then
>> you have to iterate over the data to do what you want with it.
>> What was your expectation? I'm not sure I'm getting your "so that i
>> dont have to issue sequential Scans".
>> 2013/4/8 Graeme Wallace <[EMAIL PROTECTED]>:
>>> Maybe there is an obvious way but I'm not seeing it.
>>> I have a need to query HBase for multiple chunks of data, that is
>>> equivalent to
>>> select columns
>>> from table
>>> where rowid between A and B
>>> or rowid between C and D
>>> or rowid between E and F
>>> in SQL.
>>> What's the best way to go about doing this so that I don't have to issue
>>> sequential Scans?
>>> Graeme Wallace
>>> O: 972 588 1414
>>> M: 214 681 9018
Shixiaolong 2013-04-09, 03:00
lars hofhansl 2013-04-08, 21:37