HBase, mail # dev - Generic increments?


Niels Basjes 2013-02-27, 13:47
lars hofhansl 2013-02-28, 04:30
Niels Basjes 2013-02-28, 09:22
Re: Generic increments?
Nick Dimiduk 2013-03-04, 19:37
Hi Niels,

As Lars said, I would start by reading the code for Increment and Append.
HRegion#append(Append, Boolean) should be interesting for you.
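For orientation while reading that code, here is a minimal sketch of the pattern Lars describes (lock the row, retrieve the old value, store the updated value, unlock the row). It is a toy in-memory model and not HBase code: the class LockedRowSketch, its update method and the byte[] helpers are hypothetical names made up for illustration, and the real region server does considerably more (WAL, MVCC, memstore).

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.BinaryOperator;

/** Toy in-memory "row" (hypothetical); shows only the locking pattern, not HBase internals. */
public class LockedRowSketch {
    private final ReentrantLock rowLock = new ReentrantLock();
    private final Map<String, byte[]> cells = new HashMap<>();

    /** Lock the row, read the old value, store merge(old, delta), unlock the row. */
    public byte[] update(String column, byte[] delta, BinaryOperator<byte[]> merge) {
        rowLock.lock();
        try {
            byte[] old = cells.get(column);
            // The first write to a column simply stores the delta.
            byte[] updated = (old == null) ? delta : merge.apply(old, delta);
            cells.put(column, updated);
            return updated;
        } finally {
            rowLock.unlock();
        }
    }

    public static byte[] toBytes(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    public static long toLong(byte[] b) {
        return ByteBuffer.wrap(b).getLong();
    }

    public static void main(String[] args) {
        LockedRowSketch row = new LockedRowSketch();
        // "Increment" and "max" are just two different merge functions.
        BinaryOperator<byte[]> add = (a, b) -> toBytes(toLong(a) + toLong(b));
        BinaryOperator<byte[]> max = (a, b) -> toLong(a) >= toLong(b) ? a : b;

        row.update("hits", toBytes(5), add);
        System.out.println(toLong(row.update("hits", toBytes(7), add))); // prints 12

        row.update("peak", toBytes(3), max);
        row.update("peak", toBytes(9), max);
        System.out.println(toLong(row.update("peak", toBytes(4), max))); // prints 9
    }
}
```

Passing the merge function in as a parameter is the design choice relevant to this thread: an increment and a max differ only in how the old and new byte[] are combined.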

-n

On Thu, Feb 28, 2013 at 1:22 AM, Niels Basjes <[EMAIL PROTECTED]> wrote:

> Do you have some suggestions on where I should start? Perhaps someone has
> already created a rough design for how this can be done.
>
>

> > We do not have such a facility as far as I know.
> > We have Increment/Append, and these work by locking the row, retrieving
> > the old value, storing the updated value, unlocking the row.
> >
> > -- Lars
> >
> >
> >
> > ________________________________
> >  From: Niels Basjes <[EMAIL PROTECTED]>
> > To: [EMAIL PROTECTED]
> > Sent: Wednesday, February 27, 2013 5:47 AM
> > Subject: Generic increments?
> >
> > Hi,
> >
> > Last year at a meetup I spoke with Lars George about the counters in
> > HBase.
> > What I understood is that the counters are stored as increments (i.e.
> > increment without locking) and that during compaction and querying the
> > increments are aggregated into the actual value.
> >
> > So far I've examined the API and this seems to work as long as the value
> > is a long.
> >
> > Now incrementing longs is nice, but I would like to do things like:
> > - Calculating min, max
> > - Bloom filters
> > - Average (recording both the "count" and "sum")
> > - Variance and standard deviation (using
> >   http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Parallel_algorithm )
> >
> > All of those need more bytes of internal storage and need custom code for
> > storing, aggregating and querying.
> > Querying especially, because I may want to ask several different questions
> > of a single byte[].
> > If I store both the count and the sum in a single byte[], then I can ask
> > getN(), getSum(), getAvg().
> >
> > Now my question to you guys is how I can implement such a more generic
> > form of "lock free increments" with user-defined setters, getters and a
> > custom aggregator (used for both compacting and querying).
> > Perhaps there is an example of how to do this?
> >
> > --
> > Best regards / Met vriendelijke groeten,
> >
> > Niels Basjes
>
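To make the original question concrete, here is a hedged sketch of the kind of mergeable value Niels describes: one byte[] holding count, sum, min, max and a sum of squared deviations, with a merge step that follows the parallel variance algorithm from the Wikipedia link. Everything in it is an assumption for illustration (the class name StatsSketch, the 40-byte layout, the choice of population variance); it is plain Java with no HBase API, and it shows only the data structure, not how to hook it into compaction or reads.

```java
import java.nio.ByteBuffer;

/** Hypothetical mergeable aggregate; the field layout is an assumption, not an HBase format. */
public final class StatsSketch {
    public static final int SIZE = Long.BYTES + 4 * Double.BYTES; // n, sum, min, max, m2

    private final long n;      // number of observations
    private final double sum;  // sum of observations
    private final double min;
    private final double max;
    private final double m2;   // sum of squared deviations from the mean

    private StatsSketch(long n, double sum, double min, double max, double m2) {
        this.n = n; this.sum = sum; this.min = min; this.max = max; this.m2 = m2;
    }

    /** A single observation, i.e. one "increment" to be merged later. */
    public static StatsSketch of(double x) {
        return new StatsSketch(1, x, x, x, 0.0);
    }

    /** Combines two partial aggregates; usable both at compaction and at query time. */
    public StatsSketch merge(StatsSketch o) {
        long total = n + o.n;
        double delta = o.getAvg() - getAvg();
        // Parallel variance update: M2 = M2a + M2b + delta^2 * na * nb / (na + nb)
        double mergedM2 = m2 + o.m2 + delta * delta * ((double) n * o.n / total);
        return new StatsSketch(total, sum + o.sum,
                Math.min(min, o.min), Math.max(max, o.max), mergedM2);
    }

    public byte[] toBytes() {
        return ByteBuffer.allocate(SIZE)
                .putLong(n).putDouble(sum).putDouble(min).putDouble(max).putDouble(m2)
                .array();
    }

    public static StatsSketch fromBytes(byte[] b) {
        if (b.length != SIZE) {
            throw new IllegalArgumentException("expected " + SIZE + " bytes, got " + b.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(b);
        return new StatsSketch(buf.getLong(), buf.getDouble(), buf.getDouble(),
                buf.getDouble(), buf.getDouble());
    }

    // Several different questions answered from the same byte[].
    public long getN() { return n; }
    public double getSum() { return sum; }
    public double getAvg() { return sum / n; }
    public double getMin() { return min; }
    public double getMax() { return max; }
    public double getVariance() { return m2 / n; } // population variance
    public double getStdDev() { return Math.sqrt(getVariance()); }

    public static void main(String[] args) {
        StatsSketch s = StatsSketch.of(2);
        for (double x : new double[] {4, 4, 4, 5, 5, 7, 9}) {
            s = s.merge(StatsSketch.of(x));
        }
        StatsSketch r = StatsSketch.fromBytes(s.toBytes());
        System.out.println(r.getN() + " " + r.getAvg() + " " + r.getStdDev());
    }
}
```

Because merge is associative, the partial values can be combined in any grouping, which is the property a compaction-time or query-time aggregator would rely on. A Bloom filter would fit the same shape, with a bitwise OR of the two byte[] as the merge step.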
Niels Basjes 2013-03-04, 19:42