Re: HBase aggregate query
Jerry Lam 2012-09-11, 18:48
Can you implement this using a counter?
That is, whenever you insert a row with the month(eventdate) and scene
combination, increment the associated counter by one. Note that if you have
a batch insert of N, you can increment the counter by N.
Then you can simply query the counter whenever you want the aggregated
result.
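
A minimal sketch of the counter idea, assuming an in-memory stand-in for the counter table. The names `event_counts`, `time_spent_totals`, `record_events` and `aggregate` are hypothetical and not part of any HBase API; a real deployment would rely on atomic server-side increments instead. The second counter for sum(timespent) is an assumed extension of the suggestion, added only because the original query asks for it.

```python
from collections import Counter
from datetime import date

# In-memory stand-ins for counter cells (hypothetical, for illustration only).
event_counts = Counter()        # (month, scene) -> number of rows inserted
time_spent_totals = Counter()   # (month, scene) -> running sum of timespent


def record_events(events):
    """Insert a batch of (eventdate, scene, timespent) rows and update counters.

    The batch is grouped first, so each (month, scene) counter is incremented
    once by N instead of N times by one.
    """
    batch_counts = Counter()
    batch_time = Counter()
    for eventdate, scene, timespent in events:
        key = (eventdate.month, scene)   # mirrors month(eventdate), scene
        batch_counts[key] += 1
        batch_time[key] += timespent
    for key, n in batch_counts.items():
        event_counts[key] += n
        time_spent_totals[key] += batch_time[key]


def aggregate():
    """Return (month, scene, count, sum_timespent) rows read from the counters."""
    return sorted(
        (month, scene, event_counts[(month, scene)], time_spent_totals[(month, scene)])
        for (month, scene) in event_counts
    )
```

Calling `record_events` on every insert batch keeps the counters current, so `aggregate()` only reads one small entry per group instead of scanning all rows.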
On Tue, Sep 11, 2012 at 1:59 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
> That's when you aggregate along a sorted dimension (prefix of the key),
> though. Right?
> Not sure how smart Hive is here, but if it needs to sort the data it will
> probably be slower than SQL Server for such a small data set.
> ----- Original Message -----
> From: James Taylor <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Sent: Monday, September 10, 2012 5:49 PM
> Subject: Re: HBase aggregate query
> iwannaplay games <funnlearnforkids@...> writes:
> > Hi,
> > I want to run a query like
> > select month(eventdate),scene,count(1),sum(timespent) from eventlog
> > group by month(eventdate),scene
> > in HBase. Through Hive it's taking a lot of time for 40 million
> > records. Do we have any syntax in HBase to find its result? In SQL
> > Server it takes around 9 minutes. How long might it take in HBase?
> > Regards
> > Prabhjot
> In our internal testing using server-side coprocessors for aggregation, we
> found that HBase can process these types of queries very quickly: ~10-12
> seconds on a four-node cluster. You need to chunk up and parallelize the work
> on the client side to get this kind of performance, though.
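
The chunk-and-parallelize pattern mentioned above can be sketched roughly as follows. This is only an in-memory illustration: `partial_aggregate` is a hypothetical stand-in for the per-chunk work that a server-side coprocessor would do, and the thread pool shows the shape of the client-side fan-out and merge, not the actual HBase client calls or their performance.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partial_aggregate(rows):
    """Aggregate one chunk of (month, scene, timespent) rows.

    Hypothetical stand-in for the work done server-side on one key range.
    """
    counts, totals = Counter(), Counter()
    for month, scene, timespent in rows:
        counts[(month, scene)] += 1
        totals[(month, scene)] += timespent
    return counts, totals


def chunked(rows, chunk_size):
    """Split the row list into consecutive chunks of at most chunk_size rows."""
    for start in range(0, len(rows), chunk_size):
        yield rows[start:start + chunk_size]


def parallel_aggregate(rows, chunk_size=2, workers=4):
    """Run partial aggregations in parallel, then merge them on the client."""
    counts, totals = Counter(), Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for part_counts, part_totals in pool.map(
            partial_aggregate, chunked(rows, chunk_size)
        ):
            counts.update(part_counts)   # Counter.update adds partial values
            totals.update(part_totals)
    return {key: (counts[key], totals[key]) for key in counts}
```

Because count and sum are both associative, the partial results can be merged in any order, which is what makes splitting the work by key range safe.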