Accumulo, mail # user - RE: EXTERNAL: Re: Custom Iterators


Re: EXTERNAL: Re: Custom Iterators
Billie Rinaldi 2012-08-23, 16:07
On Wed, Aug 22, 2012 at 3:22 PM, Cardon, Tejay E <[EMAIL PROTECTED]> wrote:

>  Why do some iterators have so many constructors if the system will
> simply construct them from the default constructor?
>
> Some iterators (such as OrIterator) throw an exception if init is called.
> How do these iterators get constructed and initialized?
>

Some iterators are "system" iterators and the tserver uses their special
constructors directly. These iterators have been moved to the
iterators.system package as of 1.4. Prior to 1.4, some user iterators had
constructors for testing purposes, but we have since tried to move towards
testing them in the way they will be used, i.e. through the default
constructor and passing configuration in the init method. This has been
accompanied by the use of static methods to make configuring an
IteratorSetting easier.
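
As a rough illustration of that pattern, here is a minimal, self-contained sketch. It deliberately does not use the real Accumulo classes: SketchSetting, PrefixFilterSketch, setPrefix, and load are hypothetical stand-ins invented for this example, and the real init method takes more arguments than the one shown here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a per-iterator configuration holder.
// It only mimics the idea of "a name plus string options".
final class SketchSetting {
    final String name;
    final Map<String, String> options = new HashMap<>();

    SketchSetting(String name) {
        this.name = name;
    }
}

// Hypothetical user iterator: built with the default constructor,
// configured only through init().
class PrefixFilterSketch {
    static final String PREFIX_OPT = "prefix";

    private String prefix = "";

    // The only constructor: no arguments, no configuration.
    public PrefixFilterSketch() {
    }

    // Static helper so callers do not have to hand-type option names.
    public static void setPrefix(SketchSetting setting, String prefix) {
        setting.options.put(PREFIX_OPT, prefix);
    }

    // All configuration arrives here as string options.
    public void init(Map<String, String> options) {
        prefix = options.getOrDefault(PREFIX_OPT, "");
    }

    public boolean accept(String key) {
        return key.startsWith(prefix);
    }

    // Mimics the server side: default construction first, then init().
    // (A real server would instantiate the class reflectively.)
    static PrefixFilterSketch load(SketchSetting setting) {
        PrefixFilterSketch iterator = new PrefixFilterSketch();
        iterator.init(setting.options);
        return iterator;
    }
}
```

A test written this way exercises the same path the server would take: build the setting with the static helper, then construct and init the iterator from it, rather than calling a test-only constructor.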

Billie

>
> If OrIterator can do what I’m asking for, how do I get it the “terms” and
> what format do they come in?  You mentioned JEXL expressions, but I haven’t
> seen anything about them in the documentation.
>
>
> As for my statement about the OrIterator and multiple rows, the comments
> on the compareTo for OrIterator.TermSource state “If your implementation
> can have more than one row in a tablet, you must compare row key here
> first, then column qualifier.”  But the code does not do so.  It may be
> that I’m just not fully understanding the code, however.
>
>
> Finally, I’m actually trying to do something a little more complex than
> just what I described below.  This reply is already too long and had too
> many questions in it, but I’ll get more detail out after I have a better
> handle on how the iterator framework works.
>
>
> Thanks,
>
> Tejay
>
> *From:* William Slacum [mailto:[EMAIL PROTECTED]]
> *Sent:* Wednesday, August 22, 2012 3:00 PM
> *To:* [EMAIL PROTECTED]
> *Subject:* EXTERNAL: Re: Custom Iterators
>
>
> An or clause should be able to handle an enumeration of values, as that's
> supported in a JEXL expression. It would not, however, surprise me if those
> iterators could not handle multiple rows in a tablet. If you can reproduce
> that, please file a ticket. There will be a large update occurring to the
> Wiki example in the near future.
>
> Do you have any specific questions about how you should structure your
> iterator or the contract? Making a tutorial has been on my to-do list, but
> we all know how to-do lists end up...
>
> The big things to remember are:
>
> 1) The call order: Your iterator will be created via the default
> constructor, init() will be called, then seek(). After seek() is called,
> your iterator should have a top if there is data available. A client then
> can call hasTop(), getTopKey() and getTopValue() to check and retrieve data
> (similar to hasNext() and next()) and then next() to advance the pointer.
>
> 2) Your iterator can be destroyed during a scan and then reconstructed,
> being passed in the last key returned to the client as the start of the
> range.
>
> 3) You can have multiple sources feed into a single iterator in a
> tree-like fashion by clone()'ing the source passed in to init.
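
The call order in (1) and the teardown behaviour in (2) can be sketched roughly as follows. This is a simplified model, not the Accumulo API: SketchIterator and CallOrderSketch are hypothetical names, keys and values are plain strings, and the source is a TreeMap instead of another iterator. The sketch also assumes that, after a teardown, the last key returned is treated as an exclusive start of the new range.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified iterator over sorted String keys.
class SketchIterator {
    private TreeMap<String, String> source;
    private Iterator<Map.Entry<String, String>> cursor;
    private Map.Entry<String, String> top;

    // Step 1: created through the default constructor.
    public SketchIterator() {
    }

    // Step 2: init() hands over the source.
    public void init(TreeMap<String, String> source) {
        this.source = source;
    }

    // Step 3: seek() positions the iterator; afterwards it has a top
    // if any data lies in the requested range.
    public void seek(String startKey, boolean inclusive) {
        cursor = source.tailMap(startKey, inclusive).entrySet().iterator();
        next();
    }

    public boolean hasTop() {
        return top != null;
    }

    public String getTopKey() {
        return top.getKey();
    }

    public String getTopValue() {
        return top.getValue();
    }

    // Advances to the next entry, or clears the top when exhausted.
    public void next() {
        top = cursor.hasNext() ? cursor.next() : null;
    }
}

class CallOrderSketch {
    // Builds a fresh iterator and reads at most `limit` keys, as a
    // client-driven scan would before the iterator is torn down.
    static List<String> scan(TreeMap<String, String> table, String startKey,
                             boolean inclusive, int limit) {
        SketchIterator it = new SketchIterator();
        it.init(table);
        it.seek(startKey, inclusive);
        List<String> keys = new ArrayList<>();
        while (it.hasTop() && keys.size() < limit) {
            keys.add(it.getTopKey());
            it.next();
        }
        return keys;
    }
}
```

Calling scan(table, "", true, 2) and then scan(table, lastKeyReturned, false, ...) models a scan that is torn down after two keys and resumed with a brand-new iterator. Because nothing survives between the two calls, an iterator that keeps state outside of what seek() can rebuild from the range would lose it here.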
>
> On Wed, Aug 22, 2012 at 1:41 PM, Cardon, Tejay E <[EMAIL PROTECTED]>
> wrote:
>
> All,
>
> I’m interested in writing a custom iterator, and I’ve been looking for
> documentation on how to do so.  Thus far, I’ve not been able to find
> anything beyond the java docs in SortedKeyValueIterator and a few other
> sub-classes.  A few of the examples use Iterators, but provide no real info
> on how to properly implement one.  Is there anywhere to find general
> guidance on the iterator stack?
>
>
> (If you’re interested)
>
> Specifically, for those that are curious, I’m trying to implement
> something similar to the wikisearch example, but with some key
> differences.  In my case, I’ve got a file with various attributes that