Accumulo, mail # user - row count


Venkat 2013-04-17, 01:33
Josh Elser 2013-04-17, 01:52
Venkat 2013-04-17, 02:46
David Medinets 2013-04-17, 12:17
Venkat 2013-04-18, 02:48
Keith Turner 2013-04-17, 14:42
Venkat 2013-04-18, 02:40
Re: row count
David Medinets 2013-04-18, 01:43
Could you layer a scan-time SummingCombiner on top of the
FirstEntryInRowIterator?
I don't know how to actually do this, but my instinct says it should work and
significantly reduce the traffic back to the client.
On Wed, Apr 17, 2013 at 10:42 AM, Keith Turner <[EMAIL PROTECTED]> wrote:

> On Tue, Apr 16, 2013 at 9:33 PM, Venkat <[EMAIL PROTECTED]> wrote:
> > I am sure this question has been asked several times but I could not get to
> > the answer using usual searches - which iterator is the right one to count
> > the number of rows for a given value or a pattern of value?
>
> Take a look at org.apache.accumulo.core.iterators.FirstEntryInRowIterator.
> Does anyone know why this is not in the user iterator package? Is
> there an issue with it? This will bring back the first key/value for
> each row, then you could count those on the client side. This will
> work for a range. For a pattern, David's suggestion of the regex
> filter may be useful. You could also look at
> org.apache.accumulo.core.iterators.user.RowFilter.
>
> You could use FirstEntryInRowIterator together with RegExFilter or
> RowFilter, but you would have to be careful about the order of the
> iterators.
>
> >
> > Venkat.
>
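
To make the ordering caveat concrete, here is a minimal sketch in plain Java. It does not use the Accumulo API at all: the Entry class, the sample data, and the helper names firstEntryInRow and filterByValue are hypothetical stand-ins that only model the behaviour described above (keep the first key/value of each row, or drop entries whose value does not match a pattern) on a list that is already sorted by row.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of the two iterator behaviours; not Accumulo code.
final class RowCountSketch {

    /** One cell: a row ID and a value. Column fields are omitted for brevity. */
    static final class Entry {
        final String row;
        final String value;

        Entry(String row, String value) {
            this.row = row;
            this.value = value;
        }
    }

    /** Keeps only the first entry of each row. Input must be sorted by row. */
    static List<Entry> firstEntryInRow(List<Entry> sorted) {
        List<Entry> out = new ArrayList<>();
        String lastRow = null;
        for (Entry e : sorted) {
            if (!e.row.equals(lastRow)) {
                out.add(e);
                lastRow = e.row;
            }
        }
        return out;
    }

    /** Keeps only entries whose value satisfies the predicate. */
    static List<Entry> filterByValue(List<Entry> in, Predicate<String> matches) {
        List<Entry> out = new ArrayList<>();
        for (Entry e : in) {
            if (matches.test(e.value)) {
                out.add(e);
            }
        }
        return out;
    }

    /** Made-up sample table: four rows, three of which contain "banana". */
    static List<Entry> sample() {
        List<Entry> data = new ArrayList<>();
        data.add(new Entry("row1", "apple"));
        data.add(new Entry("row1", "banana"));
        data.add(new Entry("row2", "banana"));
        data.add(new Entry("row3", "cherry"));
        data.add(new Entry("row3", "banana"));
        data.add(new Entry("row4", "apple"));
        return data;
    }

    public static void main(String[] args) {
        List<Entry> data = sample();
        Predicate<String> isBanana = v -> v.matches("banana");

        // Plain row count: one entry per row, counted on the "client".
        System.out.println(firstEntryInRow(data).size()); // prints 4

        // Filter first, then first-entry: counts rows containing a match.
        System.out.println(
            firstEntryInRow(filterByValue(data, isBanana)).size()); // prints 3

        // First-entry first, then filter: only tests each row's first value.
        System.out.println(
            filterByValue(firstEntryInRow(data), isBanana).size()); // prints 1
    }
}
```

With the filter applied before the first-entry step, every row that contains a matching value is counted once. With the order reversed, a row is counted only if its first entry happens to match, which is usually not what is wanted. In Accumulo the order is controlled by the priority given to each iterator setting, so the filter would presumably need the lower priority number to run closer to the data.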