HBase, mail # user - Re: HBase aggregate query


Re: HBase aggregate query
Jerry Lam 2012-09-11, 18:48
Hi Prabhjot:

Can you implement this using a counter?
That is, whenever you insert a row with a given month(eventdate) and scene
combination, increment the associated counter by one. Note that for a batch
insert of N rows with the same combination, you can increment the counter by N.

Then you can simply query the counter whenever you want the aggregated
result.
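
A minimal sketch of the counter idea, using an in-memory dict as a stand-in
for the counter table (an assumption for illustration; in HBase each update
would be an atomic increment on a counter cell). The names record_events and
aggregate are hypothetical, and the second counter for sum(timespent) is an
assumption taken from the original query rather than from the suggestion
above.

```python
from collections import defaultdict
from datetime import date

# In-memory stand-in for a counter table, keyed by (month, scene).
# This only models the bookkeeping; it is not an HBase client.
counters = defaultdict(lambda: {"count": 0, "timespent": 0})

def record_events(events):
    """Pre-aggregate a batch, then apply one increment per (month, scene)."""
    batch = defaultdict(lambda: [0, 0])
    for eventdate, scene, timespent in events:
        key = (eventdate.month, scene)
        batch[key][0] += 1          # rows seen for this key in the batch
        batch[key][1] += timespent  # time spent for this key in the batch
    # A batch of N rows for one key becomes a single "increment by N".
    for key, (n, total) in batch.items():
        counters[key]["count"] += n
        counters[key]["timespent"] += total

def aggregate():
    """Read the counters back as (month, scene, count, sum) rows."""
    return sorted(
        (month, scene, c["count"], c["timespent"])
        for (month, scene), c in counters.items()
    )

record_events([
    (date(2012, 9, 1), "lobby", 30),
    (date(2012, 9, 2), "lobby", 45),
    (date(2012, 8, 15), "quiz", 10),
])
print(aggregate())
```

The trade-off is that the work moves to write time: every insert pays for an
extra increment, and the aggregate is then a cheap read of a few counters.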

HTH,

Jerry

On Tue, Sep 11, 2012 at 1:59 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:

> That's when you aggregate along a sorted dimension (prefix of the key),
> though. Right?
> Not sure how smart Hive is here, but if it needs to sort the data it will
> probably be slower than SQL Server for such a small data set.
>
>
>
> ----- Original Message -----
> From: James Taylor <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Cc:
> Sent: Monday, September 10, 2012 5:49 PM
> Subject: Re: HBase aggregate query
>
> iwannaplay games <funnlearnforkids@...> writes:
> >
> > Hi,
> >
> > I want to run a query like
> >
> > select month(eventdate),scene,count(1),sum(timespent) from eventlog
> > group by month(eventdate),scene
> >
> > in HBase. Through Hive it's taking a lot of time for 40 million
> > records. Is there any syntax in HBase to get this result? In SQL
> > Server it takes around 9 minutes. How long might it take in HBase?
> >
> > Regards
> > Prabhjot
> >
> >
>
> Hi,
> In our internal testing using server-side coprocessors for aggregation,
> we've found HBase can process these types of queries very quickly: ~10-12
> seconds using a four-node cluster. You need to chunk up and parallelize the
> work on the client side to get this kind of performance, though.
> Regards,
>
> James
>
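
The "chunk up and parallelize on the client side" step James mentions can be
sketched roughly as follows. This is an in-memory illustration only: a list
of (month, scene, timespent) tuples stands in for the table, each chunk
stands in for a key range that would be aggregated server-side, and all
function names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def split_into_chunks(rows, n_chunks):
    """Split rows into at most n_chunks contiguous slices."""
    size = max(1, -(-len(rows) // n_chunks))  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]

def aggregate_chunk(chunk):
    """Partial aggregate for one chunk: per-key count and sum."""
    counts, sums = Counter(), Counter()
    for month, scene, timespent in chunk:
        counts[(month, scene)] += 1
        sums[(month, scene)] += timespent
    return counts, sums

def parallel_aggregate(rows, n_chunks=4):
    """Aggregate chunks concurrently, then merge the partial results."""
    total_counts, total_sums = Counter(), Counter()
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        partials = pool.map(aggregate_chunk, split_into_chunks(rows, n_chunks))
        for counts, sums in partials:
            total_counts.update(counts)  # Counter.update adds values
            total_sums.update(sums)
    return {key: (total_counts[key], total_sums[key]) for key in total_counts}

rows = [
    (9, "lobby", 30),
    (9, "lobby", 45),
    (8, "quiz", 10),
    (9, "quiz", 5),
    (8, "quiz", 20),
]
print(parallel_aggregate(rows, n_chunks=2))
```

Count and sum merge cleanly because they are associative, so partial results
from each chunk can be combined in any order on the client.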