Royston Sellman 2011-12-11, 18:52
I'm a newbie learning HBase using 0.90.4. Got my data bulk loading nicely into a cluster and now I want to have simple SQL-like aggregations (SUM, AVG, STD, MIN, MAX, MEDIAN, WEIGHTED MEDIAN etc) working. I started off trying to build MR code to do this but stumbled across AggregateProtocol and AggregateImplementation and AggregationClient in 0.94. Before I re-invent the wheel I'd like to check out this coprocessor aggregator functionality but I'm finding it a bit hard to get into.
It seems there has been a bit of discussion recommending that aggregations are done on the server and queried by client code. Looking at the code this seems to be the way it is architected in 0.94. Am I right about this? Is there a summary of the discussion anywhere?
I'm guessing I have to build 0.94 on my system to try the Aggregation coprocessor stuff out. Am I right, or has it been backported to a release bundle? (I've never built HBase before, only used releases, so it will take me a while to do a build and then put it on my cluster.)
Is there any user documentation for the Aggregations stuff? I can't find any but don't know if I've looked in all the right places.
Some of the answers may be in the user mailing list. Is there an easy way to search this list? I tried GMANE and search-hadoop but didn't get much from either. Is reading the code my best chance?
Grateful for any pointers on this topic.
Royston
-
Re: Aggregations in HBase
Suraj Varma 2011-12-11, 19:56
Coprocessors are available with 0.92 which now has a release candidate (RC0).
So you can probably build 0.92 RC0 to test this functionality out.
--Suraj
-
Re: Aggregations in HBase
Stack 2011-12-11, 22:28
What Suraj said. Look in the coprocessor package. There are some primitives that may be of use to you that already do some form of this.
St.Ack
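For readers new to the idea raised in this thread (aggregations computed on the server and queried by client code), here is a minimal sketch of the general pattern in plain Java. It deliberately does not use the HBase API: `RegionAggregationSketch`, `Partial`, `aggregateRegion` and `mergeAll` are hypothetical names made up for illustration, and each `long[]` merely stands in for the values held by one region. The assumption is that each region computes a small partial result and the client only merges those partials.

```java
import java.util.Arrays;
import java.util.List;

public class RegionAggregationSketch {

    /** Partial aggregate for one region (hypothetical type, not an HBase class). */
    public static final class Partial {
        public final long count;
        public final long sum;
        public final long min;
        public final long max;

        public Partial(long count, long sum, long min, long max) {
            this.count = count;
            this.sum = sum;
            this.min = min;
            this.max = max;
        }

        /** Combine two partial results without revisiting the underlying rows. */
        public Partial merge(Partial other) {
            return new Partial(
                    count + other.count,
                    sum + other.sum,
                    Math.min(min, other.min),
                    Math.max(max, other.max));
        }

        /** Average derived from the merged sum and count; NaN when there are no rows. */
        public double avg() {
            return count == 0 ? Double.NaN : (double) sum / count;
        }
    }

    /** Stands in for the work done next to the data, over one region's values. */
    public static Partial aggregateRegion(long[] values) {
        long sum = 0;
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (long v : values) {
            sum += v;
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return new Partial(values.length, sum, min, max);
    }

    /** Stands in for the client, which only sees one small Partial per region. */
    public static Partial mergeAll(List<long[]> regions) {
        // Identity element: merging with it leaves any other Partial unchanged.
        Partial total = new Partial(0, 0, Long.MAX_VALUE, Long.MIN_VALUE);
        for (long[] region : regions) {
            total = total.merge(aggregateRegion(region));
        }
        return total;
    }

    public static void main(String[] args) {
        List<long[]> regions = Arrays.asList(
                new long[] {3, 9, 4},
                new long[] {10, 2},
                new long[] {});
        Partial t = mergeAll(regions);
        System.out.println("count=" + t.count + " sum=" + t.sum
                + " min=" + t.min + " max=" + t.max + " avg=" + t.avg());
    }
}
```

SUM, MIN, MAX and AVG merge cleanly this way, and STD can too if each partial also carries a sum of squares. MEDIAN does not decompose into per-region partials like this, so it needs a different approach.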
-
Re: Aggregations in HBase
Royston Sellman 2011-12-12, 14:54
Thanks guys, I'll start with 0.92rc0 then.
Royston