HBase >> mail # user >> Online/Realtime query with filter and join?

Re: Online/Realtime query with filter and join?

You are going to want to figure out a rowkey (or a set of tables with
rowkeys) that restricts the number of I/Os. If you just slap Impala in front
of HBase (or even Phoenix, for that matter), you could write SQL against it,
but if it winds up doing a full scan of an HBase table underneath, you
won't get your < 100 ms response time.

Note:  I'm not saying you can't do this with Impala or Phoenix, I'm just
saying start with the rowkeys first so that you limit the I/O.  Then start
adding frameworks as needed (and/or build a schema with Phoenix in the
same rowkey exercise).
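
To make the rowkey point concrete, here is a minimal sketch in plain Java. It does not use the HBase client at all: a TreeMap stands in for an HBase table, since both keep rows sorted by key. The key layout (customer id, a "|" separator, a zero-padded timestamp), the class name, and the sample data are all assumptions made up for illustration. HBase compares raw bytes; for ASCII keys like these the ordering is the same as Java's String ordering.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustration only: a sorted map stands in for an HBase table.
class RowKeySketch {

    // Hypothetical layout: <customerId>|<epoch millis zero-padded to 13 digits>.
    // Zero-padding keeps numeric order and lexicographic order in agreement.
    static String rowKey(String customerId, long eventMillis) {
        return customerId + "|" + String.format(Locale.ROOT, "%013d", eventMillis);
    }

    // Smallest key that sorts after every key starting with the prefix
    // (assumes the last character of the prefix is not the maximum char value).
    static String stopKeyFor(String prefix) {
        char last = prefix.charAt(prefix.length() - 1);
        return prefix.substring(0, prefix.length() - 1) + (char) (last + 1);
    }

    // Bounded range read: touches only the rows of one customer,
    // instead of walking the whole table.
    static List<String> eventsFor(NavigableMap<String, String> table, String customerId) {
        String start = customerId + "|";
        return new ArrayList<>(table.subMap(start, true, stopKeyFor(start), false).values());
    }

    // Made-up sample rows.
    static NavigableMap<String, String> sampleTable() {
        NavigableMap<String, String> table = new TreeMap<>();
        table.put(rowKey("cust42", 1385700000000L), "login");
        table.put(rowKey("cust42", 1385700005000L), "view");
        table.put(rowKey("cust420", 1385700001000L), "login");
        table.put(rowKey("cust7", 1385700002000L), "purchase");
        return table;
    }

    public static void main(String[] args) {
        System.out.println(eventsFor(sampleTable(), "cust42"));
    }
}
```

With a real table the same idea would be a scan bounded by a start row and a stop row, so the region servers read only that key range. The separator matters: without it, a prefix read for "cust42" would also pick up "cust420".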

Such response-time requirements make me think that this is for application
support, so why the requirement for SQL? Might want to start writing it as
a Java program first.

On 11/29/13 4:32 PM, "Mourad K" <[EMAIL PROTECTED]> wrote:

>You might want to consider something like Impala or Phoenix. I presume
>you are trying to run some reporting queries for a dashboard or UI?
>MapReduce is certainly not adequate, as there is too much latency on
>startup. If you want to give this a try, CDH4 and Impala are a good start.
>On 29 Nov 2013, at 10:33, Ramon Wang <[EMAIL PROTECTED]> wrote:
>> The general performance requirement for each query is less than 100 ms;
>> that's the average level. Sounds crazy, but yes, we need to find a way
>> to do it.
>> Thanks
>> Ramon
>> On Fri, Nov 29, 2013 at 5:01 PM, yonghu <[EMAIL PROTECTED]> wrote:
>>> The question is what you mean by "real-time". What is your performance
>>> requirement? In my opinion, MapReduce is not suitable for
>>> real-time data processing.
>>> On Fri, Nov 29, 2013 at 9:55 AM, Azuryy Yu <[EMAIL PROTECTED]> wrote:
>>>> You can try Phoenix.
>>>> On 2013-11-29 3:44 PM, "Ramon Wang" <[EMAIL PROTECTED]> wrote:
>>>>> Hi Folks,
>>>>> It seems to be impossible, but I still want to check whether there is a
>>>>> way we can do "complex" queries on HBase with "ORDER BY", "JOIN", etc.,
>>>>> like we have with a normal RDBMS. We have been asked to provide such a
>>>>> solution. Any ideas? Thanks for your help.
>>>>> BTW, I think Impala from CDH might be a way to go, but I haven't had
>>>>> time to check it yet.
>>>>> Thanks
>>>>> Ramon