Accumulo >> mail # dev >> IteratorSetting and priorities


Patrone, Dennis S. 2012-10-30, 15:02
Billie Rinaldi 2012-10-31, 00:58
Re: IteratorSetting and priorities
The issue with giving multiple iterators the same priority is that the API
specifies that during the call to init(), the iterator is given one source.
Now, that iterator can make multiple copies of that source via deepCopy()
to make a tree of iterators, but by default it's given one source.

In the absence of a more convenient API for tracking priorities, you could
create a Queue<IteratorSetting>, push the filters you want onto it, and,
once you're done, apply each IteratorSetting to the Scanner in turn.
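
A rough sketch of that queue idea follows. It is not Accumulo code: FilterSpec
and IteratorTarget are hypothetical stand-ins for IteratorSetting and for the
addScanIterator() call on the scanner, and IteratorQueue is an invented helper
name. The only point is that one object collects the filters and hands out
unique, increasing priorities in insertion order, so the classes adding
filters never pick a number themselves.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical stand-in for the name/class part of an IteratorSetting.
final class FilterSpec {
    final String name;
    final String iteratorClass;

    FilterSpec(String name, String iteratorClass) {
        this.name = name;
        this.iteratorClass = iteratorClass;
    }
}

// Hypothetical stand-in for the single scanner operation this sketch needs.
interface IteratorTarget {
    void addScanIterator(int priority, String name, String iteratorClass);
}

// Collects filters in FIFO order and assigns each one a distinct priority.
final class IteratorQueue {
    private final Queue<FilterSpec> pending = new ArrayDeque<>();
    private int nextPriority;

    // firstPriority is an assumption: pick a value no other iterator uses.
    IteratorQueue(int firstPriority) {
        this.nextPriority = firstPriority;
    }

    // Any class holding the queue can add a filter without knowing priorities.
    void add(String name, String iteratorClass) {
        pending.add(new FilterSpec(name, iteratorClass));
    }

    // Drains the queue onto the target, one priority per filter, and
    // returns how many filters were applied.
    int applyTo(IteratorTarget target) {
        int applied = 0;
        FilterSpec spec;
        while ((spec = pending.poll()) != null) {
            target.addScanIterator(nextPriority++, spec.name, spec.iteratorClass);
            applied++;
        }
        return applied;
    }
}
```

With a real scanner, the target would presumably build an IteratorSetting from
the three arguments and pass it to the scanner's addScanIterator(). The sketch
still assumes nothing else has already claimed a priority in the range it
hands out.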

Personally, I have kicked around the idea of client helpers that keep
track of priorities and provide queue- or stack-like interfaces for setting
up iterators. This doesn't solve the disparity between being able to create
trees of iterators on the server side versus only being able to create a stack
on the client side.

On Tue, Oct 30, 2012 at 11:02 AM, Patrone, Dennis S. <
[EMAIL PROTECTED]> wrote:

> Hi all,
>
> Is there a reason that ScannerOptions only allows a single iterator per
> priority value?  It seems that multiple iterators added at the same
> priority could just be executed in an arbitrary order by the system.
>
> I have a ScannerBase that gets passed around through several classes.
>  These classes add different filters (for different reasons) to the scanner
> based on the particular request being processed and user configuration.
>  Requiring only one filter per priority imposes a dependency among the
> different classes managing the filters.  They have to coordinate to make
> sure no one reuses the same priority.
>
> I'd rather be able to set priorities based on the (expected) selectivity
> of the filter only within the class adding a subset of the filters, and let
> the cross-'domain' filtering priorities be managed automatically by
> Accumulo.
>
> Even worse, the ScannerBase API does not provide access to the
> already-added IteratorSettings or even the min/max iterator priority, so I
> have no way AFAICT to ensure via the API that my iterator priority is not
> in conflict with an existing priority.  I have to manage the priority value
> through an unenforceable convention... and wait for a RuntimeException(!)
> to tell me when the convention is violated.
>
> I think minimally an accessor method needs to be added so I can ensure my
> priority isn't going to clash and cause an IllegalArgumentException.
>
> Ideally, I'd like to see filters added at the same priority allowed and
> just executed in some arbitrary order (or some well-defined order within
> the priority, e.g., in order they were added?).
>
> I'd be willing to contribute some updates for this, but before I started I
> wanted to see if this is reasonable, if anyone else thinks it is a good
> idea, or if there are real valid reasons only one iterator per priority is
> allowed.
>
> Thanks,
> Dennis
>
>
> Dennis Patrone
> The Johns Hopkins University / Applied Physics Laboratory
> 240-228-2285 / Washington
> 443-778-2285 / Baltimore
> 443-220-7190 / Cell
> [EMAIL PROTECTED]
>
>
Patrone, Dennis S. 2012-10-31, 11:52
William Slacum 2012-10-31, 13:40