HBase >> mail # dev >> Simple stastics per region
Jesse Yates 2013-02-27, 00:31
Re: Simple stastics per region
I filed HBASE-7958 <https://issues.apache.org/jira/browse/HBASE-7958> to
follow up on this. Includes a summary of the discussion so far.

-------------------
Jesse Yates
@jesse_yates
jyates.github.com
On Tue, Feb 26, 2013 at 4:31 PM, Jesse Yates <[EMAIL PROTECTED]> wrote:

> The more I think about it, the more I'd like it in core. OSGi is something
> I'd like to avoid as long as we can, and baking this in makes (I think)
> more sense overall. This is especially true for how to deal with displaying
> the histograms in the UI - dependent CPs make me twitch.
>
> The things we would need to make this happen cleanly (IMO) would be:
>
>    - system tables
>       - basically metadata in the table descriptor that would hide it
>       from the usual user queries like list_tables, etc. and expose something
>       like deleteSystemTable
>    - An extra 'stat' scanner that goes on top of the store scanner used
>    for compaction that writes to the stats system table
>       - CPs could still muck with this, but as always, that's at their
>       own peril
>    - Some pretty UI graphs on the master for the stats
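A minimal sketch of the 'stat' scanner idea from the list above, in plain Python rather than HBase code. Everything named here (`StatScanner`, `compact`, the dict standing in for the stats system table, counting rows per key prefix) is a hypothetical illustration, not an existing HBase API. It only shows the shape: wrap the scanner the compaction already consumes, collect stats as cells pass through, and write them out when the scan finishes.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List, Tuple

# (row key, value): a stand-in for an HBase cell, assumed for illustration.
Cell = Tuple[bytes, bytes]


class StatScanner:
    """Passes cells through unchanged while counting them per row-key prefix."""

    def __init__(self, inner: Iterable[Cell], prefix_len: int = 1):
        self._inner = inner
        self._prefix_len = prefix_len
        self.counts: Counter = Counter()

    def __iter__(self) -> Iterator[Cell]:
        for row, value in self._inner:
            # Side effect only: the cell itself is not modified.
            self.counts[row[: self._prefix_len]] += 1
            yield row, value


def compact(
    region: str,
    cells: Iterable[Cell],
    stats_table: Dict[str, Dict[bytes, int]],
) -> List[Cell]:
    """Toy 'compaction': sort the cells and record stats gathered on the way."""
    scanner = StatScanner(sorted(cells))
    compacted = list(scanner)  # the compaction consumes the scanner as usual
    # Stats are written once, after the scan has seen every cell.
    stats_table[region] = dict(scanner.counts)
    return compacted
```

For example, compacting three cells with row keys `b"a1"`, `b"a2"` and `b"b1"` for region `"r1"` leaves `{b"a": 2, b"b": 1}` under `"r1"` in the stand-in stats table.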
>
> The debatable piece is then: pluggable? If so, to what degree?
>
> Something Lars just mentioned which would be nice is to have a Chore-like
> mechanism that lets people easily change the stats they want to keep track
> of. Probably along the lines of dynamic config, but simpler, since we can just
> push the changes into a waiting state element/queue-thingy and then let the
> next round of major compaction grab them without race concerns.
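One way to read the "waiting state element" idea, as a hedged sketch in plain Python: requested changes are parked in a pending slot, and a major compaction swaps them in only at its start, so a running compaction never sees its stat set change underneath it. The names `PendingStats`, `request_change`, `begin_compaction` and `run_major_compaction` are made up for this illustration and do not correspond to anything in HBase.

```python
import threading
from typing import Callable, Dict, List, Optional

# A stat is modeled as a function over the values seen during a compaction.
StatFn = Callable[[List[int]], float]


class PendingStats:
    """Holds the active stat set plus at most one pending replacement."""

    def __init__(self, active: Dict[str, StatFn]):
        self._lock = threading.Lock()
        self._active = dict(active)
        self._pending: Optional[Dict[str, StatFn]] = None

    def request_change(self, new_stats: Dict[str, StatFn]) -> None:
        """Park a new stat set; it takes effect at the next compaction."""
        with self._lock:
            self._pending = dict(new_stats)

    def begin_compaction(self) -> Dict[str, StatFn]:
        """Swap in any pending stat set and return a private snapshot."""
        with self._lock:
            if self._pending is not None:
                self._active, self._pending = self._pending, None
            return dict(self._active)


def run_major_compaction(config: PendingStats, values: List[int]) -> Dict[str, float]:
    # The snapshot is taken once, so later requests cannot affect this run.
    stats = config.begin_compaction()
    return {name: fn(values) for name, fn in stats.items()}
```

A change requested while a compaction is running simply waits in the pending slot until the following round picks it up.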
>
> Shall I file a JIRA (and sub-JIRAs) to get this into core? We can also
> take the discussion there.
> -------------------
> Jesse Yates
> @jesse_yates
> jyates.github.com
>
>
> On Tue, Feb 26, 2013 at 4:27 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>
>> Just had a discussion with the Phoenix folks (my cubicle neighbors :) ).
>> Turns out that the types of problem we're trying to solve for Phoenix
>> would need equal-depth histograms, whereas for decisions such as picking a
>> secondary index equal-width histograms are often used.
>> So a key part of this is a proper framework through which stats can be hooked
>> up and calculated. OSGi for coprocessors would be nice, but may also be
>> overkill for this.
>> Maybe something like the chores framework would work.
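For readers unfamiliar with the two histogram types mentioned above, here is a small Python sketch of the difference. It is a generic illustration with made-up function names, not code from HBase or Phoenix: equal-width buckets split the value range into intervals of the same size, while equal-depth buckets each hold roughly the same number of values.

```python
from typing import List, Sequence, Tuple

Bucket = Tuple[float, float, int]  # (lower bound, upper bound, count)


def equal_width_histogram(values: Sequence[float], buckets: int) -> List[Bucket]:
    """Buckets cover equally sized value ranges; counts may be very uneven."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / buckets or 1.0  # avoid zero width if all values match
    counts = [0] * buckets
    for v in values:
        # The maximum value would land one past the end, so clamp it.
        idx = min(int((v - lo) / width), buckets - 1)
        counts[idx] += 1
    return [(lo + i * width, lo + (i + 1) * width, counts[i]) for i in range(buckets)]


def equal_depth_histogram(values: Sequence[float], buckets: int) -> List[Bucket]:
    """Buckets hold about the same number of values; ranges may be uneven."""
    ordered = sorted(values)
    n = len(ordered)
    result: List[Bucket] = []
    for i in range(buckets):
        chunk = ordered[i * n // buckets : (i + 1) * n // buckets]
        if chunk:  # skip empty buckets when there are fewer values than buckets
            result.append((chunk[0], chunk[-1], len(chunk)))
    return result
```

With the skewed input `[1, 2, 3, 4, 100, 101, 102, 1000]` and four buckets, the equal-width counts are 7, 0, 0, 1, while every equal-depth bucket holds two values.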
>>
>> In either case, there will be core stats (that would allow HBase to
>> decide between a scan and a multi get), and user defined stats to help
>> higher layers such as Phoenix, or an indexing library.
>>
>>
>> -- Lars
>>
>>
>>
>> ________________________________
>>  From: Enis Söztutar <[EMAIL PROTECTED]>
>> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
>> Sent: Tuesday, February 26, 2013 4:15 PM
>> Subject: Re: Simple stastics per region
>>
>> +1 for core. I can see that histograms might help us in automatic splits
>> and merges as well.
>>
>>
>> On Tue, Feb 26, 2013 at 3:27 PM, Andrew Purtell <[EMAIL PROTECTED]>
>> wrote:
>>
>> > If this is going to be a CP then other CPs need an easy way to use the
>> > output stats. If a subsequent proposal from core requires statistics from
>> > this CP, does that then mandate that it must itself be a CP? What if that
>> > can't work?
>> >
>> > Putting the stats into a table addresses the first concern.
>> >
>> > For the second, it is an issue that comes up I think when building a
>> > generally useful shared function as a CP. Please consider inserting my
>> > earlier comments about OSGi here, in that we trend toward a real module
>> > system if we're not careful (unless that is the aim).
>> >
>> >
>> > On Tue, Feb 26, 2013 at 2:31 PM, Jesse Yates <[EMAIL PROTECTED]> wrote:
>> >
>> > > TL;DR Making it part of the UI and ensuring that you don't load things the
>> > > wrong way seem to be the only reasons for making this part of core -
>> > > certainly not bad reasons. They are fairly easy to handle as a CP though,
>> > > so maybe it's not necessary immediately.
>> > >
>> > > I ended up writing a simple stats framework last week (ok, it's like 6