Accumulo >> mail # user >> Custom Iterators


Re: Custom Iterators
An OR clause should be able to handle an enumeration of values, since that is
supported in a JEXL expression. It would not surprise me, however, if those
iterators could not handle multiple rows in a tablet. If you can reproduce
that, please file a ticket. A large update to the Wiki example is coming in
the near future.

Do you have any specific questions about how you should structure your
iterator or about the contract? Making a tutorial has been on my to-do list,
but we all know how to-do lists end up...

The big things to remember are:

1) The call order: your iterator will be created via the default
constructor, then init() will be called, then seek(). After seek() is
called, your iterator should have a top if there is data available. A client
can then call hasTop(), getTopKey() and getTopValue() to check for and
retrieve data (similar to hasNext() and next()), and then next() to advance
the pointer.

2) Your iterator can be destroyed during a scan and then reconstructed. When
that happens, seek() is passed the last key returned to the client as the
start of the range.

3) You can have multiple sources feed into a single iterator in a tree-like
fashion by copying the source passed in to init() (the copy method on
SortedKeyValueIterator is deepCopy()).
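
To make points 1 and 2 concrete, here is a minimal sketch in plain Java. It
does not use the Accumulo classes: SketchIterator, MapSource,
NonEmptyValueIterator and IteratorSketch are hypothetical stand-ins, keys and
values are plain Strings, and seek() takes a start key instead of a Range. It
also assumes the re-seek after a teardown starts just after the last key
returned, so that key is not handed back twice. Treat it as an illustration
of the call order, not as the real interface.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical, simplified stand-in for the iterator contract (NOT the Accumulo interface).
interface SketchIterator {
    void init(SketchIterator source);               // called once, right after the default constructor
    void seek(String startKey, boolean inclusive);  // position on the first entry in the range
    boolean hasTop();
    String getTopKey();
    String getTopValue();
    void next();
}

// Bottom of the stack: reads from an in-memory sorted map.
class MapSource implements SketchIterator {
    private final NavigableMap<String, String> data;
    private Map.Entry<String, String> top;

    MapSource(NavigableMap<String, String> data) { this.data = data; }

    public void init(SketchIterator source) { /* nothing below this one */ }

    public void seek(String startKey, boolean inclusive) {
        top = inclusive ? data.ceilingEntry(startKey) : data.higherEntry(startKey);
    }

    public boolean hasTop() { return top != null; }
    public String getTopKey() { return top.getKey(); }
    public String getTopValue() { return top.getValue(); }
    public void next() { top = data.higherEntry(top.getKey()); }
}

// Custom iterator: passes through only the entries whose value is non-empty.
class NonEmptyValueIterator implements SketchIterator {
    private SketchIterator source;

    public NonEmptyValueIterator() { }              // default constructor

    public void init(SketchIterator source) { this.source = source; }

    public void seek(String startKey, boolean inclusive) {
        source.seek(startKey, inclusive);
        skipRejected();                             // a top must be ready after seek() if data exists
    }

    public boolean hasTop() { return source.hasTop(); }
    public String getTopKey() { return source.getTopKey(); }
    public String getTopValue() { return source.getTopValue(); }

    public void next() {
        source.next();
        skipRejected();
    }

    private void skipRejected() {
        while (source.hasTop() && source.getTopValue().isEmpty()) {
            source.next();
        }
    }
}

class IteratorSketch {
    // Constructor, then init(): the first two steps of the call order.
    static SketchIterator build(NavigableMap<String, String> data) {
        SketchIterator it = new NonEmptyValueIterator();
        it.init(new MapSource(data));
        return it;
    }

    // Scans all data, throwing the iterator away and rebuilding it after every `batch` keys.
    static List<String> scan(NavigableMap<String, String> data, int batch) {
        List<String> out = new ArrayList<>();
        SketchIterator it = build(data);
        it.seek("", true);                          // "" sorts before every key in this sketch
        int sinceRebuild = 0;
        while (it.hasTop()) {
            String last = it.getTopKey();
            out.add(last + "=" + it.getTopValue());
            if (++sinceRebuild == batch) {
                it = build(data);                   // simulated teardown and reconstruction
                it.seek(last, false);               // assumption: resume just after the last key returned
                sinceRebuild = 0;
            } else {
                it.next();
            }
        }
        return out;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> data = new TreeMap<>();
        data.put("row1", "a");
        data.put("row2", "");
        data.put("row3", "c");
        data.put("row4", "d");
        System.out.println(scan(data, 2));          // prints [row1=a, row3=c, row4=d]
    }
}
```

The scan gives the same result whatever the batch size, because the only
thing the iterator needs in order to resume is the key handed to seek(). Any
other state kept in fields is lost when the iterator is rebuilt, so it is
safest to design a custom iterator so that it can pick up from an arbitrary
key.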

On Wed, Aug 22, 2012 at 1:41 PM, Cardon, Tejay E <[EMAIL PROTECTED]> wrote:

> All,
>
> I’m interested in writing a custom iterator, and I’ve been looking for
> documentation on how to do so.  Thus far, I’ve not been able to find
> anything beyond the Javadocs for SortedKeyValueIterator and a few other
> sub-classes.  A few of the examples use iterators, but provide no real info
> on how to properly implement one.  Is there anywhere to find general
> guidance on the iterator stack?
>
> (If you’re interested)
>
> Specifically, for those that are curious, I’m trying to implement
> something similar to the wikisearch example, but with some key
> differences.  In my case, I’ve got a file with various attributes that
> are being indexed.  So for each file there are 5 attributes, and each
> attribute has a fixed number of possible values.  For example (totally
> made up):
>
> personID, gender, hair color, country, race, personRecord
>
> Row:binID; ColFam:Attribute_AttributeValue; ColQ:personID; Val:blank
>
> AND
>
> Row:binID; ColFam:"D"; ColQ:personID; Val:personRecord
>
> A typical query would be:
>
> Give me the personRecord for all people with:
>
> Gender: male &
>
> Hair color: blond or brown &
>
> Country: USA or England or China or Korea &
>
> Race: white or oriental
>
> The existing iterators used in the wikisearch example are unable to handle
> the “or” clauses in each attribute.
>
> The OrIterator doesn’t appear to handle the possibility of more than one
> row per tablet.
>
> Thanks,
>
> Tejay Cardon