lars hofhansl 2013-02-23, 06:40
Andrew Purtell 2013-02-23, 17:41
lars hofhansl 2013-02-23, 20:39
Andrew Purtell 2013-02-23, 17:18
Stack 2013-02-26, 22:08
Jesse Yates 2013-02-26, 22:31
Andrew Purtell 2013-02-26, 23:27
Enis Söztutar 2013-02-27, 00:15
lars hofhansl 2013-02-27, 00:27
Jesse Yates 2013-02-27, 00:31
I filed HBASE-7958 <https://issues.apache.org/jira/browse/HBASE-7958> to
follow up on this. Includes a summary of the discussion so far.
On Tue, Feb 26, 2013 at 4:31 PM, Jesse Yates <[EMAIL PROTECTED]> wrote:
> The more I think about it, the more I'd like it in core. OSGi is something
> I'd like to avoid as long as we can, and baking this in makes (I think)
> more sense overall. This is especially true for how to deal with displaying
> the histograms in the UI - dependent CPs make me twitch.
> The things we would need to make this happen cleanly (IMO) would be:
> - system tables
>   - basically metadata in the table descriptor that would hide it
>     from the usual user queries like list_tables, etc. and expose something
>     like deleteSystemTable
> - An extra 'stat' scanner that goes on top of the store scanner used
>   for compaction that writes to the stats system table
>   - CPs could still muck with this, but as always, that's at their
>     own peril
> - Some pretty UI graphs on the master for the stats
> The debatable piece is then: pluggable? If so, to what degree?
> Something Lars just mentioned which would be nice is to have a Chore-like
> mechanism that lets people easily change the stats they want to keep track
> of. Probably along the lines of dynamic config, but we can just push
> the changes into a waiting state element/queue-thingy and then let the next
> round of major compaction grab it without race concerns.
> Shall I file a JIRA (and sub-jiras) to get this into core; we can also
> take discussion there?
> Jesse Yates
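The "waiting state element/queue-thingy" idea above can be sketched roughly as follows. This is only an illustration in plain Java, not HBase code: the class `PendingStatsConfig`, its methods, and the use of stat names as strings are all hypothetical. Requests are queued at any time, and each major compaction drains the queue once at its start and then works from a private snapshot, so a request arriving mid-compaction simply waits for the next round.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch: none of these names exist in HBase.
public class PendingStatsConfig {
    // Stat requests that have arrived but not yet been picked up by a compaction.
    private final ConcurrentLinkedQueue<String> pending = new ConcurrentLinkedQueue<>();
    // Stats that compactions currently collect.
    private final List<String> active = new ArrayList<>();

    /** May be called from any thread at any time. */
    public void request(String statName) {
        pending.add(statName);
    }

    /**
     * Called once at the start of a major compaction. Drains pending requests
     * into the active list and returns a copy, so requests that arrive while
     * the compaction runs do not change what that compaction collects.
     */
    public synchronized List<String> beginCompaction() {
        String name;
        while ((name = pending.poll()) != null) {
            if (!active.contains(name)) {
                active.add(name);
            }
        }
        return new ArrayList<>(active);
    }

    public static void main(String[] args) {
        PendingStatsConfig config = new PendingStatsConfig();
        config.request("equal-depth-histogram");
        List<String> firstRound = config.beginCompaction();
        config.request("min-max");            // arrives mid-compaction
        System.out.println(firstRound);        // unaffected by the late request
        System.out.println(config.beginCompaction());
    }
}
```

Handing each compaction a copy rather than the live list is what avoids the race: the only shared mutable state is the queue, which is safe for concurrent producers.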
> On Tue, Feb 26, 2013 at 4:27 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>> Just had a discussion with the Phoenix folks (my cubicle neighbors :) ).
>> Turns out that the types of problem we're trying to solve for Phoenix
>> would need equal-depth histograms, whereas for decisions such as picking a
>> 2ndary index equal-width histograms are often used.
>> So a key in this is a proper framework through which stats can be hooked
>> up and calculated. OSGi for coprocessors would be nice, but may also be
>> overkill for this.
>> Maybe something like the chores framework would work.
>> In either case, there will be core stats (that would allow HBase to
>> decide between a scan and a multi get), and user defined stats to help
>> higher layers such as Phoenix, or an indexing library.
>> -- Lars
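To make the equal-width versus equal-depth distinction above concrete, here is a minimal sketch in plain Java. It is not taken from HBase or Phoenix; the class and method names are hypothetical, values are plain longs rather than row keys, and the input is assumed to be non-empty. Equal-width buckets cover value ranges of the same size, so skewed data piles into a few buckets; equal-depth buckets hold roughly the same number of values each, so their boundaries follow the data.

```java
import java.util.Arrays;

// Hypothetical illustration only; assumes a non-empty input array.
public class HistogramSketch {

    /** Equal-width: split [min, max] into ranges of equal size and count values per range. */
    public static int[] equalWidthCounts(long[] values, int buckets) {
        long min = Arrays.stream(values).min().getAsLong();
        long max = Arrays.stream(values).max().getAsLong();
        double width = (double) (max - min + 1) / buckets;
        int[] counts = new int[buckets];
        for (long v : values) {
            int idx = (int) ((v - min) / width);
            counts[Math.min(idx, buckets - 1)]++;   // guard against rounding at the top edge
        }
        return counts;
    }

    /** Equal-depth: return the upper bound of each bucket so that buckets hold about the same number of values. */
    public static long[] equalDepthBounds(long[] values, int buckets) {
        long[] sorted = values.clone();
        Arrays.sort(sorted);
        long[] bounds = new long[buckets];
        for (int b = 1; b <= buckets; b++) {
            int end = (int) ((long) b * sorted.length / buckets);  // exclusive end index of bucket b
            bounds[b - 1] = sorted[Math.max(end - 1, 0)];
        }
        return bounds;
    }

    public static void main(String[] args) {
        long[] values = {1, 2, 3, 4, 5, 6, 7, 100};
        System.out.println(Arrays.toString(equalWidthCounts(values, 4)));  // [7, 0, 0, 1]
        System.out.println(Arrays.toString(equalDepthBounds(values, 4)));  // [2, 4, 6, 100]
    }
}
```

With one outlier at 100, the equal-width histogram puts seven of eight values in its first bucket, while the equal-depth bounds still split the data into four buckets of two values each.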
>> From: Enis Söztutar <[EMAIL PROTECTED]>
>> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
>> Sent: Tuesday, February 26, 2013 4:15 PM
>> Subject: Re: Simple stastics per region
>> +1 for core. I can see that histograms might help us in automatic splits
>> and merges as well.
>> On Tue, Feb 26, 2013 at 3:27 PM, Andrew Purtell <[EMAIL PROTECTED]>
>> > If this is going to be a CP then other CPs need an easy way to use the
>> > output stats. If a subsequent proposal from core requires the statistics
>> > this CP produces, does that then mandate that it must itself be a CP? What if that
>> > work?
>> > Putting the stats into a table addresses the first concern.
>> > For the second, it is an issue that comes up I think when building a
>> > generally useful shared function as a CP. Please consider inserting my
>> > earlier comments about OSGi here, in that we trend toward a real module
>> > system if we're not careful (unless that is the aim).
>> > On Tue, Feb 26, 2013 at 2:31 PM, Jesse Yates <[EMAIL PROTECTED]> wrote:
>> > > TL;DR Making it part of the UI and ensuring that you don't load things the
>> > > wrong way seem to be the only reasons for making this part of core -
>> > > certainly not bad reasons. They are fairly easy to handle as a CP,
>> > > so maybe it's not necessary immediately.
>> > >
>> > > I ended up writing a simple stats framework last week (ok, it's like 6