Re: Tracking cardinality in Accumulo
What's the expected size of your unique key set? Thousands? Millions?
Billions?

You could probably use a table structure similar to
https://github.com/calrissian/accumulo-recipes/tree/master/store/metrics-store
but just have it emit 1's instead of summing them.

I'm thinking maybe your mappings could be like this:
group=anything, type=NAME, name=John(etc...)
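
To make the mapping concrete, here is a minimal in-memory sketch in Python. It does not use the Accumulo API. The row layout (group|unit|bucket), the time-unit names, and the helper names presence_entries and write are assumptions made for illustration; the actual metrics-store key layout may differ. The point is only that writing a constant 1 makes repeated sightings of the same name collapse into a single entry per time bucket.

```python
from datetime import datetime, timezone

# Hypothetical time units and bucket formats; the real metrics-store may differ.
TIME_FORMATS = {
    "MINUTES": "%Y%m%d%H%M",
    "HOURS": "%Y%m%d%H",
    "DAYS": "%Y%m%d",
}

def presence_entries(group, type_, name, when):
    """Return one (row, column family, column qualifier, value) tuple per time unit."""
    entries = []
    for unit, fmt in TIME_FORMATS.items():
        row = f"{group}|{unit}|{when.strftime(fmt)}"
        entries.append((row, type_, name, 1))  # always emit 1, never a running sum
    return entries

def write(table, entries):
    """Store entries in a dict standing in for the table; rewrites overwrite."""
    for row, cf, cq, value in entries:
        table[(row, cf, cq)] = value

table = {}
ts = datetime(2014, 5, 16, 17, 19, tzinfo=timezone.utc)
write(table, presence_entries("anything", "NAME", "John", ts))
write(table, presence_entries("anything", "NAME", "John", ts))  # duplicate sighting
write(table, presence_entries("anything", "NAME", "Jane", ts))
```

With this layout the number of distinct qualifiers under a given row and type is the cardinality for that bucket, no matter how many times each name was written.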

Perhaps a ColumnQualifierGrouping iterator could be applied at scan time to
add up the cardinalities for the qualifiers over the time range being
scanned. Cardinalities from scans over different time units would then be
aggregated client side.
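
A rough Python sketch of that scan-then-merge idea, again as a plain in-memory simulation and not Accumulo code. The group|unit|bucket row layout and the function names distinct_quals and cardinality are assumptions for illustration. One caveat on the merge step: summing per-bucket counts would count a name once for every bucket it appears in, so this sketch unions the qualifiers across time units instead. That is one reading of the suggestion, not something the message spells out.

```python
from collections import defaultdict

def distinct_quals(entries, unit, start, end):
    """Simulate one scan: collect qualifiers per type for buckets in [start, end]."""
    grouped = defaultdict(set)
    for (row, cf, cq), _value in entries.items():
        _group, row_unit, bucket = row.split("|")
        # Buckets are fixed-width digit strings, so string comparison orders them by time.
        if row_unit == unit and start <= bucket <= end:
            grouped[cf].add(cq)
    return grouped

def cardinality(entries, ranges):
    """Client-side merge: union the qualifier sets from each time-unit scan."""
    merged = defaultdict(set)
    for unit, start, end in ranges:
        for cf, quals in distinct_quals(entries, unit, start, end).items():
            merged[cf] |= quals
    return {cf: len(quals) for cf, quals in merged.items()}

# Hypothetical table contents: each entry is a presence marker with value 1.
entries = {
    ("anything|DAYS|20140515", "NAME", "John"): 1,
    ("anything|DAYS|20140515", "NAME", "Jane"): 1,
    ("anything|HOURS|2014051617", "NAME", "John"): 1,
    ("anything|HOURS|2014051617", "NAME", "Alice"): 1,
}

# Cover one whole day with a DAYS scan and the partial day with an HOURS scan.
ranges = [
    ("DAYS", "20140515", "20140515"),
    ("HOURS", "2014051600", "2014051617"),
]
result = cardinality(entries, ranges)
```

Here John appears in both scans but is counted once, so result holds 3 for NAME. Shipping qualifier sets to the client only works while the key set is small, which is why the expected size matters; for very large sets an approximate counter would be a different design.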
On Fri, May 16, 2014 at 5:19 PM, David Medinets <[EMAIL PROTECTED]> wrote: