Accumulo >> mail # user >> Intersecting & OR iterators

Corey Nolet 2012-08-30, 02:51
Billie Rinaldi 2012-08-30, 13:01
Re: Intersecting & OR iterators

Sure, your proposed solution should work very well. After finding a
document, if you can construct a Range that encompasses many documents,
it would be trivial to create some code to aggregate many documents
instead of just one.
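
To make the idea concrete, here is a minimal sketch of the "one range, many documents" approach. It does not use the Accumulo client API: a TreeMap stands in for the sorted table, and the key layout (document ID, a NUL separator, then a field name), the class name RangeAggregationSketch and the method fetchDocuments are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

class RangeAggregationSketch {
    // Hypothetical key layout: <docId> SEP <fieldName>. Not an Accumulo convention.
    static final char SEP = '\u0000';

    /**
     * Scans one contiguous key range that covers every requested document
     * and groups the matching entries by document ID.
     */
    static Map<String, Map<String, String>> fetchDocuments(
            NavigableMap<String, String> table, SortedSet<String> docIds) {
        Map<String, Map<String, String>> docs = new LinkedHashMap<>();
        if (docIds.isEmpty()) {
            return docs;
        }
        // Range from the first key of the first document to just past the last document.
        String start = docIds.first() + SEP;
        String end = docIds.last() + (char) (SEP + 1);
        for (Map.Entry<String, String> e : table.subMap(start, true, end, false).entrySet()) {
            int i = e.getKey().indexOf(SEP);
            if (i < 0) {
                continue; // key does not follow the assumed layout
            }
            String docId = e.getKey().substring(0, i);
            if (!docIds.contains(docId)) {
                continue; // inside the range, but not a document we asked for
            }
            docs.computeIfAbsent(docId, k -> new LinkedHashMap<>())
                .put(e.getKey().substring(i + 1), e.getValue());
        }
        return docs;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = new TreeMap<>();
        table.put("doc1" + SEP + "name", "Paul");
        table.put("doc1" + SEP + "hairColor", "Brown");
        table.put("doc2" + SEP + "name", "Gary");
        table.put("doc3" + SEP + "name", "Lee");
        SortedSet<String> wanted = new TreeSet<>();
        wanted.add("doc1");
        wanted.add("doc3");
        System.out.println(fetchDocuments(table, wanted).keySet());
    }
}
```

The trade-off is that a single wide range also reads documents that fall between the ones you want (doc2 above), so this only pays off when the matching documents sit close together in sort order.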

Have you taken a look at the wikisearch example? It has the ability to
specify arbitrary boolean expressions, wrapping multiple Intersecting
and Or iterators. The wikisearch code is now stored in contrib, a
directory above trunk in subversion. A write-up Eric Newton composed -

- Josh

On 8/29/12 10:51 PM, Corey Nolet wrote:
> I've been using the intersecting iterator to give me server side AND
> intersections with Accumulo 1.4.0 and I'm currently in the process of
> upgrading to Accumulo 1.4.1. I see the IntersectingIterator has been
> deprecated and the IndexedDocIterator has taken its place. If I'm
> reading through the examples correctly, I see that the
> IndexedDocIterator is forcing a schema that assumes your doc contents
> can all be mashed together into one data structure in the value of the
> index row (in my case, I've got a bunch of key/value pairs as the
> contents). What if I need the contents to be separated so I can apply
> cell level visibility to the query? Does it make sense to put a UUID
> to another index as the "contents" and then perform another lookup
> once after retrieving the intersection result? I've been looking all
> over the place for good examples of this schema so I admit that I
> could be missing some key things in my understanding while reading
> through the source code.
> Also, AND queries are nice to do on server side, but I need the
> ability to perform AND and OR queries in concert with one another. For
> example, let's say I want to find everyone whose name is Paul or whose
> name is Gary or whose name is Lee and who has brown hair. That would mean
> I need to look up everything where (name=Paul | name=Gary | name=Lee)
> & hairColor=Brown. Do I need to extend the IntersectingIterator or the
> IndexedDocIterator and make my own that will allow full query
> criteria as input?
> --
> Corey Nolet
> Senior Software Engineer
> TexelTek, inc.
> [Office] 301.880.7123
> [Cell] 410-903-2110
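
For reference, the query shape from the quoted message, (name=Paul | name=Gary | name=Lee) & hairColor=Brown, can be sketched as a small expression tree whose OR nodes union document IDs and whose AND nodes intersect them. This is only an in-memory illustration of the nesting, not the wikisearch code or any Accumulo iterator: the index is a plain Map from "field=value" terms to sorted document IDs, and the names BooleanQuerySketch, Node, term, or and and are hypothetical. A server-side iterator would stream sorted keys rather than materialize sets.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

class BooleanQuerySketch {
    /** A query node that evaluates to the sorted set of matching document IDs. */
    interface Node {
        SortedSet<String> eval(Map<String, SortedSet<String>> index);
    }

    /** Leaf: documents indexed under a single "field=value" term. */
    static Node term(String fieldValue) {
        return index -> index.getOrDefault(fieldValue, Collections.emptySortedSet());
    }

    /** OR: union of the children's document IDs. */
    static Node or(Node... children) {
        return index -> {
            SortedSet<String> out = new TreeSet<>();
            for (Node child : children) {
                out.addAll(child.eval(index));
            }
            return out;
        };
    }

    /** AND: intersection of the children's document IDs. */
    static Node and(Node... children) {
        return index -> {
            SortedSet<String> out = null;
            for (Node child : children) {
                SortedSet<String> ids = child.eval(index);
                if (out == null) {
                    out = new TreeSet<>(ids); // copy so the index is never mutated
                } else {
                    out.retainAll(ids);
                }
            }
            return out == null ? new TreeSet<>() : out;
        };
    }

    public static void main(String[] args) {
        Map<String, SortedSet<String>> index = new HashMap<>();
        index.put("name=Paul", new TreeSet<>(Arrays.asList("d1", "d4")));
        index.put("name=Gary", new TreeSet<>(Arrays.asList("d2")));
        index.put("name=Lee", new TreeSet<>(Arrays.asList("d3")));
        index.put("hairColor=Brown", new TreeSet<>(Arrays.asList("d1", "d3", "d5")));

        Node query = and(
            or(term("name=Paul"), term("name=Gary"), term("name=Lee")),
            term("hairColor=Brown"));
        System.out.println(query.eval(index)); // prints [d1, d3]
    }
}
```

Because every node returns document IDs in sorted order, AND and OR nodes can be nested to any depth, which is the same property that lets intersecting and OR iterators wrap one another.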