Accumulo, mail # user - Finding my way through the forest of iterators


Re: Finding my way through the forest of iterators
Benson Margulies 2012-02-14, 16:39
Keith,

I think I'm good with the simple filtering stuff right now. I'm aiming to
make a second pass once 1.4 is released to revisit the problem of
aggregation across CQ instead of only on Value.

--benson
On Tue, Feb 14, 2012 at 11:23 AM, Keith Turner <[EMAIL PROTECTED]> wrote:

> The WholeRowIterator is one way to do this.  One drawback is that it
> reads entire rows into memory.
>
> If rows may not fit into memory, there is an efficient way to handle
> this using two iterators. The first iterator creates a second iterator
> that is used to determine whether a row contains the column. If it does
> not, you seek the original iterator past the row. If the second iterator
> shows that the row does contain the column, you can read the row from
> the original iterator. This design lets you efficiently return rows
> that meet particular criteria without reading whole rows into memory.
> If you are interested in learning more, I can point you to examples in
> the 1.4 code.
>
>
> On Tue, Feb 14, 2012 at 9:54 AM, Benson Margulies <[EMAIL PROTECTED]>
> wrote:
> > I'm working with 1.3.5, and there's not a ton of javadoc in
> > org.apache.accumulo.core.iterators.
> >
> > If I want to just filter to the rows that have a particular CF/CQ value,
> > does one of the existing iterator classes do the job, or do I need to
> > write one?
> >
>
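
The two-iterator approach Keith describes can be illustrated with a small plain-Java model. This is only a sketch under stated assumptions, not Accumulo code: a TreeMap stands in for the sorted key/value data, and the names Key, rowsWithColumn, and sampleTable are hypothetical. A separate lookup plays the role of the second (probe) iterator, and the main cursor is either streamed through the row or seeked past it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

class RowColumnFilterSketch {

    /** Hypothetical stand-in for a sorted (row, CF, CQ) key; not the Accumulo Key class. */
    static final class Key implements Comparable<Key> {
        final String row, cf, cq;

        Key(String row, String cf, String cq) {
            this.row = row;
            this.cf = cf;
            this.cq = cq;
        }

        @Override
        public int compareTo(Key o) {
            int c = row.compareTo(o.row);
            if (c == 0) c = cf.compareTo(o.cf);
            if (c == 0) c = cq.compareTo(o.cq);
            return c;
        }

        @Override
        public String toString() {
            return row + " " + cf + ":" + cq;
        }
    }

    /** Returns the entries of every row that contains column cf:cq. */
    static List<String> rowsWithColumn(NavigableMap<Key, String> table, String cf, String cq) {
        List<String> out = new ArrayList<>();
        Key cursor = table.isEmpty() ? null : table.firstKey();
        while (cursor != null) {
            String row = cursor.row;
            // Probe: an independent lookup seeks straight to the column in this row,
            // so the row itself is never buffered.
            if (table.containsKey(new Key(row, cf, cq))) {
                // Match: stream the row from the main cursor, one entry at a time.
                for (Map.Entry<Key, String> e : table.tailMap(cursor, true).entrySet()) {
                    if (!e.getKey().row.equals(row)) break;
                    out.add(e.getKey() + " = " + e.getValue());
                }
            }
            // Either way, seek the main cursor to the first key of the next row.
            // row + "\0" is the smallest string that sorts after row.
            cursor = table.ceilingKey(new Key(row + "\0", "", ""));
        }
        return out;
    }

    /** Made-up sample data for the demonstration. */
    static NavigableMap<Key, String> sampleTable() {
        NavigableMap<Key, String> t = new TreeMap<>();
        t.put(new Key("row1", "cf1", "a"), "1");
        t.put(new Key("row1", "meta", "flag"), "x");
        t.put(new Key("row2", "cf1", "a"), "2");
        t.put(new Key("row3", "cf1", "b"), "3");
        t.put(new Key("row3", "meta", "flag"), "y");
        return t;
    }

    public static void main(String[] args) {
        for (String line : rowsWithColumn(sampleTable(), "meta", "flag")) {
            System.out.println(line);
        }
    }
}
```

In this model row2 is skipped with a single seek because the probe finds no meta:flag entry. A WholeRowIterator-style approach would instead have to hold each row in memory before deciding whether to keep it.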