Accumulo, mail # user - grouping

Re: grouping
Benson Margulies 2012-02-11, 14:36
On Fri, Feb 10, 2012 at 7:43 PM, Billie J Rinaldi wrote:
> When you add an aggregator or combiner to a scan, it doesn't aggregate all the values in the scan range. It provides an aggregated value for each unique cell (i.e. row, column family, column qualifier, and column visibility tuple). It aggregates together values for keys that only differ by timestamp.
> If there is a VersioningIterator configured for the table (which there is by default), make sure to set the aggregator or combiner at a lower "priority" than the versioning, so that it occurs first -- or just remove the VersioningIterator from the table.

OK, I see. I have a different problem: I want to be able to control
the definition of a unique cell, and only aggregate values with the
same rowid and CF, ignoring the CQ. So I guess I'll be doing my own
aggregation for the foreseeable future unless I restructure the data.
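
For illustration, here is a minimal sketch of what "doing my own aggregation" on the client side might look like. It is plain Java and deliberately does not use the Accumulo API: the `Cell` class and the `sumByRowAndFamily` method are hypothetical names made up for this example, and it assumes the values are longs being summed and that the scan results have already been read into a list.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupByRowAndFamily {

    /** Hypothetical stand-in for one scanned entry; not an Accumulo class. */
    static final class Cell {
        final String row;
        final String family;
        final String qualifier;
        final long value;

        Cell(String row, String family, String qualifier, long value) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.value = value;
        }
    }

    /**
     * Sums values per (row, column family) pair, ignoring the qualifier.
     * The map key is a two-element list [row, family]; insertion order
     * is preserved so groups come back in the order they were first seen.
     */
    static Map<List<String>, Long> sumByRowAndFamily(List<Cell> cells) {
        Map<List<String>, Long> sums = new LinkedHashMap<>();
        for (Cell c : cells) {
            List<String> group = Arrays.asList(c.row, c.family);
            sums.merge(group, c.value, Long::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        List<Cell> cells = Arrays.asList(
                new Cell("r1", "cf", "a", 1),
                new Cell("r1", "cf", "b", 2),   // same row and CF, different CQ
                new Cell("r1", "cf2", "a", 10), // different CF, so a separate group
                new Cell("r2", "cf", "a", 5));
        System.out.println(sumByRowAndFamily(cells));
    }
}
```

The grouping key leaves out the qualifier on purpose, which is the difference from the per-cell behaviour described in the quoted text.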

> Billie