HBase, mail # user - Re: HBase aggregate query


Re: HBase aggregate query
James Taylor 2012-09-13, 19:13
No, there's no sorted dimension. This would be a full table scan over
40M rows. The ~10-12 second figure assumes the following:
1) your regions are evenly distributed across a four-node cluster
2) the set of unique (month, scene) combinations is small enough to fit into memory
3) you chunk the scan up on the client side and run the chunks in parallel
(with a final merge phase on the client)
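
To make the chunk-and-merge idea concrete, here is a minimal sketch in plain Python. It is not HBase code and not our actual implementation: the in-memory ROWS list stands in for the eventlog table, contiguous slices stand in for key-range chunks, and every name in it (split_into_chunks, aggregate_chunk, merge_partials, aggregate_parallel) is hypothetical. In a real setup the per-chunk step would run server-side (for example in a coprocessor) and only the small partial maps would travel back to the client.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from datetime import date

# Hypothetical stand-in for the eventlog table: (eventdate, scene, timespent).
ROWS = [
    (date(2012, 8, 3), "intro", 30),
    (date(2012, 8, 9), "intro", 45),
    (date(2012, 8, 21), "level1", 120),
    (date(2012, 9, 1), "intro", 20),
    (date(2012, 9, 2), "level1", 60),
    (date(2012, 9, 5), "level1", 90),
]


def split_into_chunks(rows, n_chunks):
    """Split rows into at most n_chunks contiguous slices (stand-in for key ranges)."""
    size = max(1, -(-len(rows) // n_chunks))  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def aggregate_chunk(rows):
    """Partial aggregate for one chunk: (month, scene) -> [count, sum(timespent)]."""
    partial = defaultdict(lambda: [0, 0])
    for eventdate, scene, timespent in rows:
        acc = partial[(eventdate.month, scene)]
        acc[0] += 1
        acc[1] += timespent
    return dict(partial)


def merge_partials(partials):
    """Final client-side merge: add up counts and sums per (month, scene) key."""
    merged = defaultdict(lambda: [0, 0])
    for partial in partials:
        for key, (count, total) in partial.items():
            merged[key][0] += count
            merged[key][1] += total
    return {key: tuple(value) for key, value in merged.items()}


def aggregate_parallel(rows, n_chunks=4):
    """Run the per-chunk aggregation in parallel, then merge on the client."""
    chunks = split_into_chunks(rows, n_chunks)
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        partials = list(pool.map(aggregate_chunk, chunks))
    return merge_partials(partials)


result = aggregate_parallel(ROWS)
for (month, scene), (count, total) in sorted(result.items()):
    print(month, scene, count, total)
```

The merge step only works because count and sum are both additive, so partial results from different chunks can be combined in any order. Assumption 2 above matters here: each partial map and the merged map hold one entry per distinct (month, scene) pair, so they stay small even when the scan covers 40M rows.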
On 09/11/2012 10:59 AM, lars hofhansl wrote:
> That's when you aggregate along a sorted dimension (prefix of the key), though. Right?
> Not sure how smart Hive is here, but if it needs to sort the data it will probably be slower than SQL Server for such a small data set.
>
>
>
> ----- Original Message -----
> From: James Taylor<[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Cc:
> Sent: Monday, September 10, 2012 5:49 PM
> Subject: Re: HBase aggregate query
>
> iwannaplay games<funnlearnforkids@...>  writes:
>> Hi ,
>>
>> I want to run a query like
>>
>> select month(eventdate),scene,count(1),sum(timespent) from eventlog
>> group by month(eventdate),scene
>>
>> in HBase. Through Hive it's taking a lot of time for 40 million
>> records. Is there any syntax in HBase to get this result? In SQL
>> Server it takes around 9 minutes. How long might it take in HBase?
>>
>> Regards
>> Prabhjot
>>
>>
> Hi,
> In our internal testing using server-side coprocessors for aggregation, we've
> found HBase can process these types of queries very quickly: ~10-12 seconds
> using a four node cluster. You need to chunk up and parallelize the work on the
> client side to get this kind of performance, though.
> Regards,
>
> James
>