Accumulo >> mail # user >> how to use CountingIterator to count records?

Re: how to use CountingIterator to count records?
You're kind of there. You can think of your Scanner's
interactions with the TServers as a tree with a height of two. Your
Scanner is the "root" and its children are all of the TServers it
needs to interact with. The operation you want is to
sum the number of records each of the children has.

In Accumulo terms, you can use something like a CountingIterator to
count the number of results on each TServer. You can then sum all of
those intermediate results to get a total count of results.
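
To make the two-level sum concrete, here is a minimal sketch in plain Java. It does not use the Accumulo/Cloudbase API at all: the class and method names (CountSketch, countOnServer, totalCount) are hypothetical, and each "server" is just an in-memory list standing in for the entries a TServer would see. It only illustrates the shape of the computation, with each child producing one count and the root adding them up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CountSketch {

    // Hypothetical stand-in for the server-side step: walk the local
    // entries and return only how many there were, not the entries.
    static long countOnServer(List<String> localEntries) {
        long count = 0;
        for (String ignored : localEntries) {
            count++;
        }
        return count;
    }

    // Hypothetical stand-in for the client-side step: the "root" sums
    // the one partial count it receives from each child.
    static long totalCount(List<List<String>> servers) {
        long total = 0;
        for (List<String> localEntries : servers) {
            total += countOnServer(localEntries);
        }
        return total;
    }

    public static void main(String[] args) {
        List<List<String>> servers = new ArrayList<>();
        servers.add(Arrays.asList("row1", "row2", "row3")); // first server
        servers.add(Arrays.asList("row4"));                 // second server
        servers.add(new ArrayList<String>());               // server with no matches

        System.out.println(totalCount(servers)); // prints 4
    }
}
```

The point of splitting it this way is that only one number per server has to travel back to the client, rather than every matching entry.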

On Wed, Jun 6, 2012 at 10:39 AM, Hunter Provyn <[EMAIL PROTECTED]> wrote:
> I want to know the number of records a scanner has without actually getting
> the records from cloudbase.
> I've been looking at CountingIterator (1.3.4), which has a getCount()
> method.  However, I don't know how
> to access the instance to call getCount() on it because Cloudbase server
> just passes back the entries and doesn't expose the instance of the
> iterator.
>
> It is possible to use an AggregatingIterator to aggregate all entries into a
> single entry whose value is the number of entries.  But I was wondering if
> there was a better way that possibly makes use of the CountingIterator
> class.
>