Re: Using Accumulo To Calculate Seven Day Rolling Average
Adam Fuchs 2012-05-19, 01:42
You could use a combiner for values that match the same day, and then roll
off whole days. This could be used along with a scan-time combiner to do
averages across multiple days. Alternatively, s/day/hour/g or s/day/minute/g.

Exponentially weighted moving averages might also be cool to do in a combiner:
http://en.wikipedia.org/wiki/Exponential_decay

Cheers,
Adam

On Fri, May 18, 2012 at 9:21 PM, David Medinets <[EMAIL PROTECTED]> wrote:

> I'm replying a little late but Combiners replace the original values.
> Therefore, I don't think they can be used to calculate the kind of
> rolling averages I am calculating. There are other kinds of moving
> averages that don't depend on historical data but frankly I don't
> remember their names.
>
> On Thu, Apr 12, 2012 at 10:25 PM, Billie J Rinaldi
> <[EMAIL PROTECTED]> wrote:
> > You could alternatively use a Combiner like the following to calculate
> > the average (though I haven't tested this bit of code). You would
> > configure this as a scan-time iterator (either a persistent scan iterator
> > for the table, or attached to a particular Scanner) and would use the
> > STRING encoding type of the LongCombiner. Not that it would necessarily
> > be better to use a Combiner to average together 7 things, but I thought
> > it would make a good example.
> >
> > import java.util.Iterator;
> > import org.apache.accumulo.core.data.Key;
> > import org.apache.accumulo.core.iterators.LongCombiner;
> >
> > // Averages all of the values combined for a key.
> > public class AveragingCombiner extends LongCombiner {
> >   @Override
> >   public Long typedReduce(Key key, Iterator<Long> iter) {
> >     long sum = 0;
> >     long count = 0;
> >     while (iter.hasNext()) {
> >       sum = safeAdd(sum, iter.next());
> >       count++;
> >     }
> >     // Integer division: the result is truncated toward zero.
> >     return sum / count;
> >   }
> > }
> >
> > Billie
> >
> >
> > ----- Original Message -----
> >> From: "David Medinets" <[EMAIL PROTECTED]>
> >> To: [EMAIL PROTECTED]
> >> Sent: Wednesday, April 11, 2012 10:59:46 PM
> >> Subject: Using Accumulo To Calculate Seven Day Rolling Average
> >> Thanks. Using this technique seems to work.
> >> I wrote a blog entry to document it:
> >>
> >> Using Accumulo To Calculate Seven Day Rolling Average
> >> http://affy.blogspot.com/2012/04/using-accumulo-to-calculate-seven-day.html
> >>
> >> On Wed, Apr 11, 2012 at 2:20 PM, Adam Fuchs <[EMAIL PROTECTED]> wrote:
> >> > David,
> >> >
> >> > In case of continuing confusion, I think it's best if you ignore Bill's
> >> > suggestion for now and heed Josh's advice. Bill's suggestion might be an
> >> > optimization to look at later on, but your initial approach seems sound.
> >> >
> >> > Adam
> >> >
> >> > On Tue, Apr 10, 2012 at 10:52 PM, David Medinets <[EMAIL PROTECTED]> wrote:
> >> >>
> >> >> I thought there were issues associated with doing mutations inside
> >> >> iterators?
> >> >>
> >> >> On Tue, Apr 10, 2012 at 10:35 PM, William Slacum <[EMAIL PROTECTED]> wrote:
> >> >> > I don't think you'd necessarily need an aggregator for that, although
> >> >> > it doesn't seem like that's what you're doing here in the first place.
> >> >> > Wouldn't it be easier to set a summation iterator that also keeps a
> >> >> > count of observations to do some server side math and then combine it
> >> >> > all on the client? That way you can have a time series and to get
> >> >> > weekly averages you just change your scan range.
> >> >> > On Apr 10, 2012, at 10:16 PM, David Medinets wrote:
> >> >> >
> >> >> >> I'm still thinking about how to use Accumulo to calculate weekly
> >> >> >> moving averages. I thought that using the maxVersions setting might
> >> >> >> work to maintain the last 7 values. Then a program could simply sum
> >> >> >> the values of a given row. So this is what I did:
> >> >> >>
> >> >> >> bin/accumulo shell -u root -p password
> >> >> >> > createtable rolling
> >> >> >> rolling> config -t rolling -s table.iterator.scan.vers.opt.maxVersions=7
> >> >> >> rolling> insert row cf cq 1
> >> >> >> rolling> insert row cf cq 2
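
For anyone reading the thread later, the approach discussed above (collapse same-day values into a sum and a count, roll off whole days, then average across the window at read time) can be sketched in plain Java without Accumulo. This is only an illustration under stated assumptions: the class RollingDailyAverage, its method names, and the use of a day number as the key are all hypothetical and not part of the Accumulo API, and the in-memory TreeMap merely stands in for a table whose per-day entries a combiner would have summed.

```java
import java.util.TreeMap;

// Hypothetical helper, not part of Accumulo: keeps one {sum, count} pair per
// day, the way a summing combiner would collapse same-day values, and
// averages over a trailing window of days at read time.
class RollingDailyAverage {
    // day number (for example, days since the epoch) -> {sum, count}
    private final TreeMap<Long, long[]> days = new TreeMap<>();

    // Fold one observation into its day's running sum and count.
    void add(long day, long value) {
        long[] cell = days.computeIfAbsent(day, d -> new long[2]);
        cell[0] = Math.addExact(cell[0], value); // throws on overflow
        cell[1]++;
    }

    // Roll off whole days that fall before the window ending at lastDay.
    void rollOff(long lastDay, int windowDays) {
        days.headMap(lastDay - windowDays + 1, false).clear();
    }

    // Average of every observation in the windowDays days ending at lastDay.
    // Returns NaN when the window holds no observations.
    double average(long lastDay, int windowDays) {
        long sum = 0;
        long count = 0;
        long firstDay = lastDay - windowDays + 1;
        for (long[] cell : days.subMap(firstDay, true, lastDay, true).values()) {
            sum = Math.addExact(sum, cell[0]);
            count += cell[1];
        }
        return count == 0 ? Double.NaN : (double) sum / count;
    }

    // Number of distinct days currently retained.
    int storedDays() {
        return days.size();
    }
}
```

Averaging from per-day sums and counts, rather than from per-day averages, keeps the result weighted by the number of observations on each day, which is the sum-plus-count idea William raised earlier in the thread. Changing the window only changes the range passed to average, much like changing a scan range.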