HBase, mail # user - Online/Realtime query with filter and join?


Re: Online/Realtime query with filter and join?
Doug Meil 2013-12-02, 18:58

You are going to want to figure out a rowkey (or a set of tables with
rowkeys) that restricts the number of I/Os. If you just slap Impala in front
of HBase (or even Phoenix, for that matter), you could write SQL against it,
but if it winds up doing a full scan of an HBase table underneath, you
won't get your < 100 ms response time.

Note:  I'm not saying you can't do this with Impala or Phoenix, I'm just
saying start with the rowkeys first so that you limit the I/O.  Then start
adding frameworks as needed (and/or build a schema with Phoenix in the
same rowkey exercise).
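
To make the rowkey point concrete, here is a rough sketch of the idea in plain
Java. It is not the HBase client API: a TreeMap stands in for HBase's sorted
rowkey order, and the key layout ("<customerId>|<zero-padded timestamp>"), the
class name and the method names are all made up for illustration. It only
shows that a key-range read touches the matching rows, while a filter over the
whole table touches every row to return the same answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Sketch only: a TreeMap stands in for HBase's lexicographically sorted rowkeys.
// This is NOT the HBase client API; every name here is hypothetical.
final class RowKeySketch {

    // Hypothetical composite rowkey: "<customerId>|<zero-padded timestamp>".
    // Zero-padding keeps string order consistent with numeric order.
    static String rowKey(String customerId, long timestamp) {
        return customerId + "|" + String.format(Locale.ROOT, "%013d", timestamp);
    }

    // Builds a small in-memory "table" of customers x events.
    static NavigableMap<String, String> buildTable(int customers, int eventsPerCustomer) {
        NavigableMap<String, String> table = new TreeMap<>();
        for (int c = 1; c <= customers; c++) {
            for (int e = 0; e < eventsPerCustomer; e++) {
                table.put(rowKey("cust" + c, e), "event-" + c + "-" + e);
            }
        }
        return table;
    }

    // Bounded range read: only rows whose key starts with the prefix are visited.
    // Assumes keys never contain the character '\uffff'.
    static NavigableMap<String, String> prefixScan(NavigableMap<String, String> table, String prefix) {
        return table.subMap(prefix, true, prefix + Character.MAX_VALUE, false);
    }

    // Full scan with a filter: same answer, but every row in the table is visited.
    static List<String> fullScanFilter(NavigableMap<String, String> table, String prefix) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> row : table.entrySet()) {
            if (row.getKey().startsWith(prefix)) {
                hits.add(row.getValue());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = buildTable(1000, 100);
        NavigableMap<String, String> hits = prefixScan(table, "cust42|");
        System.out.println("range read visits " + hits.size()
                + " rows; full scan visits " + table.size());
    }
}
```

Note the trailing "|" in the prefix: without a delimiter, a prefix of "cust42"
would also pick up "cust420", "cust421" and so on. Because the timestamp is
the tail of the key, rows for one customer also come back in time order, which
is the kind of "Order By" you can get for free from the key design.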

Such response-time requirements make me think that this is for application
support, so why the requirement for SQL? Might want to start writing it as
a Java program first.

On 11/29/13 4:32 PM, "Mourad K" <[EMAIL PROTECTED]> wrote:

>You might want to consider something like Impala or Phoenix, I presume
>you are trying to do some report query for dashboard or UI?
>MapReduce is certainly not adequate as there is too much latency on
>startup. If you want to give this a try, CDH4 and Impala are a good start.
>
>Mouradk
>
>On 29 Nov 2013, at 10:33, Ramon Wang <[EMAIL PROTECTED]> wrote:
>
>> The general performance requirement for each query is less than 100 ms,
>> that's the average level. Sounds crazy, but yes we need to find a way for
>> it.
>>
>> Thanks
>> Ramon
>>
>>
>> On Fri, Nov 29, 2013 at 5:01 PM, yonghu <[EMAIL PROTECTED]> wrote:
>>
>>> The question is what you mean by "real-time". What is your performance
>>> requirement? In my opinion, MapReduce is not suitable for real-time
>>> data processing.
>>>
>>>
>>> On Fri, Nov 29, 2013 at 9:55 AM, Azuryy Yu <[EMAIL PROTECTED]> wrote:
>>>
>>>> You can try Phoenix.
>>>> On 2013-11-29 3:44 PM, "Ramon Wang" <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> Hi Folks
>>>>>
>>>>> It seems to be impossible, but I still want to check if there is a way we
>>>>> can do "complex" queries on HBase with "Order By", "JOIN", etc. like we have
>>>>> with a normal RDBMS. We are asked to provide such a solution for it. Any
>>>>> ideas? Thanks for your help.
>>>>>
>>>>> BTW, I think maybe Impala from CDH would be a way to go, but haven't got
>>>>> time to check it yet.
>>>>>
>>>>> Thanks
>>>>> Ramon
>>>