Intersecting & OR iterators
I've been using the IntersectingIterator to get server-side AND
intersections with Accumulo 1.4.0, and I'm currently in the process of
upgrading to Accumulo 1.4.1. I see the IntersectingIterator has been
deprecated and the IndexedDocIterator has taken its place. If I'm reading
the examples correctly, the IndexedDocIterator forces a schema that
assumes your doc contents can all be mashed together into one data
structure in the value of the index row (in my case, I've got a bunch of
key/value pairs as the contents). What if I need these contents to be
separated so I can apply cell-level visibility to the query? Does it make
sense to store a UUID pointing into another index as the "contents" and
then perform a second lookup after retrieving the intersection result?
I've been looking all over the place for good examples of this schema, so
I admit that I could be missing some key things in my understanding from
reading through the source code.

Also, AND queries are nice to do on the server side, but I need the
ability to perform AND and OR queries in concert with one another. For
example, let's say I want to find everyone whose name is Paul, Gary, or
Lee and who has brown hair. That would mean I need to look up everything
where (name=Paul | name=Gary | name=Lee) & hairColor=Brown. Do I need to
extend the IntersectingIterator or the IndexedDocIterator and make my own
iterator that accepts full query criteria as input?
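
To make the semantics of that query concrete, here is a minimal sketch in
plain Java. It does not use any Accumulo API: the in-memory index, the
sample documents, and the names BooleanQuerySketch, sampleIndex, anyOf,
allOf, direct, and distributed are all hypothetical and exist only for
illustration. It evaluates the query two ways: directly as (OR of names)
AND hair color, and rewritten by distributing the AND over the OR, so that
it becomes a union of three plain AND intersections.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

public class BooleanQuerySketch {

    // Hypothetical in-memory term index: term -> sorted IDs of matching docs.
    // This stands in for an index table; it is not an Accumulo structure.
    static Map<String, SortedSet<String>> sampleIndex() {
        Map<String, SortedSet<String>> index = new TreeMap<>();
        index.put("name=Paul", new TreeSet<>(Arrays.asList("doc1")));
        index.put("name=Gary", new TreeSet<>(Arrays.asList("doc2")));
        index.put("name=Lee", new TreeSet<>(Arrays.asList("doc3")));
        index.put("name=Mark", new TreeSet<>(Arrays.asList("doc4")));
        index.put("hairColor=Brown",
                new TreeSet<>(Arrays.asList("doc1", "doc3", "doc4")));
        index.put("hairColor=Blond", new TreeSet<>(Arrays.asList("doc2")));
        return index;
    }

    // OR: union of the doc ID sets of every given term.
    static SortedSet<String> anyOf(Map<String, SortedSet<String>> index,
                                   String... terms) {
        SortedSet<String> result = new TreeSet<>();
        for (String term : terms) {
            SortedSet<String> docs = index.get(term);
            if (docs != null) {
                result.addAll(docs);
            }
        }
        return result;
    }

    // AND: intersection of two doc ID sets.
    static SortedSet<String> allOf(SortedSet<String> left,
                                   SortedSet<String> right) {
        SortedSet<String> result = new TreeSet<>(left);
        result.retainAll(right);
        return result;
    }

    // (name=Paul | name=Gary | name=Lee) & hairColor=Brown, evaluated directly.
    static SortedSet<String> direct(Map<String, SortedSet<String>> index) {
        SortedSet<String> names =
                anyOf(index, "name=Paul", "name=Gary", "name=Lee");
        return allOf(names, anyOf(index, "hairColor=Brown"));
    }

    // Same query with the AND distributed over the OR:
    // (Paul & Brown) | (Gary & Brown) | (Lee & Brown).
    static SortedSet<String> distributed(Map<String, SortedSet<String>> index) {
        SortedSet<String> brown = anyOf(index, "hairColor=Brown");
        SortedSet<String> result = new TreeSet<>();
        for (String name : new String[] {"name=Paul", "name=Gary", "name=Lee"}) {
            result.addAll(allOf(anyOf(index, name), brown));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, SortedSet<String>> index = sampleIndex();
        System.out.println(direct(index));      // prints [doc1, doc3]
        System.out.println(distributed(index)); // prints [doc1, doc3]
    }
}
```

If the distributed form is acceptable, one possible workaround would be to
run one AND intersection per name and merge the results on the client,
though that costs one scan per OR term, which is why I'm asking whether a
custom iterator is the better route.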

--
Corey Nolet
Senior Software Engineer
TexelTek, inc.
[Office] 301.880.7123
[Cell] 410-903-2110