Accumulo >> mail # user >> Intersecting & OR iterators


Re: Intersecting & OR iterators
Corey,

Sure, your proposed solution should work very well. After finding a
document, if you can construct a Range that encompasses many documents,
it would be trivial to create some code to aggregate many documents
instead of just one.
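
To illustrate the two-step lookup, here is a minimal sketch in plain Python.
It is not Accumulo API code: the table layout, the doc IDs, and the names
DOC_TABLE, scan_range, and fetch_documents are all hypothetical, and
visibility is reduced to a simple label match rather than Accumulo's
visibility expressions. The idea is that each key/value pair of a document is
its own entry (so it can carry its own visibility), and one range covering
several matching doc IDs is scanned to aggregate them together.

```python
from bisect import bisect_left

# Hypothetical document table: sorted (row, field, visibility, value) entries,
# one entry per key/value pair so each cell keeps its own visibility label.
DOC_TABLE = sorted([
    ("doc1", "hairColor", "public", "Brown"),
    ("doc1", "name", "public", "Paul"),
    ("doc1", "ssn", "secret", "123"),
    ("doc2", "hairColor", "public", "Brown"),
    ("doc2", "name", "public", "Gary"),
    ("doc3", "name", "public", "Lee"),
])

def scan_range(table, start_row, end_row):
    """Yield entries whose row lies in [start_row, end_row], like a range scan."""
    i = bisect_left(table, (start_row,))
    while i < len(table) and table[i][0] <= end_row:
        yield table[i]
        i += 1

def fetch_documents(table, doc_ids, auths):
    """Aggregate the visible cells of every matching doc ID in one range pass."""
    wanted = set(doc_ids)
    if not wanted:
        return {}
    docs = {}
    for row, field, vis, value in scan_range(table, min(wanted), max(wanted)):
        # Simplification: a cell is visible when its label is in auths.
        if row in wanted and vis in auths:
            docs.setdefault(row, {})[field] = value
    return docs

# Doc IDs as they might come back from the server-side intersection.
result = fetch_documents(DOC_TABLE, ["doc1", "doc2"], auths={"public"})
```

With only the "public" label, the "ssn" cell of doc1 is left out of the
result, which is the per-cell filtering a single mashed-together value
cannot give you.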

Have you taken a look at the wikisearch example? It has the ability to
specify arbitrary boolean expressions, wrapping multiple Intersecting
and Or iterators. The wikisearch code is now stored in contrib, a
directory above trunk in Subversion. Eric Newton composed a write-up:
http://accumulo.apache.org/example/wikisearch.html
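
As a rough picture of what nesting AND and OR nodes means for your
(name=Paul | name=Gary | name=Lee) & hairColor=Brown query, here is a small
sketch in plain Python. It is not the wikisearch code and does not use the
Accumulo iterator API; INDEX, term, or_, and and_ are hypothetical names, and
the index contents are made up for the example.

```python
# Hypothetical inverted index: term -> sorted list of document IDs.
INDEX = {
    "name=Paul": ["doc1", "doc4"],
    "name=Gary": ["doc2"],
    "name=Lee": ["doc3"],
    "hairColor=Brown": ["doc1", "doc2", "doc5"],
}

def term(t):
    """Leaf node: the doc IDs indexed under a single term."""
    return lambda index: index.get(t, [])

def or_(*children):
    """OR node: union of the doc IDs produced by the child nodes."""
    def run(index):
        merged = set()
        for child in children:
            merged.update(child(index))
        return sorted(merged)
    return run

def and_(*children):
    """AND node: intersection of the doc IDs produced by the child nodes."""
    def run(index):
        results = [set(child(index)) for child in children]
        return sorted(set.intersection(*results)) if results else []
    return run

# (name=Paul | name=Gary | name=Lee) & hairColor=Brown
query = and_(
    or_(term("name=Paul"), term("name=Gary"), term("name=Lee")),
    term("hairColor=Brown"),
)
matches = query(INDEX)
```

Because every node takes the index and returns a sorted list of doc IDs, the
nodes compose to any depth, which is the property that lets a full query
expression be passed in as one tree.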

- Josh

On 8/29/12 10:51 PM, Corey Nolet wrote:
> I've been using the intersecting iterator to give me server side AND
> intersections with Accumulo 1.4.0 and I'm currently in the process of
> upgrading to Accumulo 1.4.1. I see the IntersectingIterator has been
> deprecated and the IndexedDocIterator has taken its place. If I'm
> reading through the examples correctly, I see that the
> IndexedDocIterator is forcing a schema that assumes your doc contents
> can all be mashed together into one data structure in the value of the
> index row (in my case, I've got a bunch of key/value pairs as the
> contents). What if I need these contents to be separated so I can apply
> cell-level visibility to the query? Does it make sense to store a UUID
> pointing to another index as the "contents" and then perform a second
> lookup after retrieving the intersection result? I've been looking all
> over the place for good examples of this schema so I admit that I
> could be missing some key things in my understanding while reading
> through the source code.
>
> Also, AND queries are nice to do on server side, but I need the
> ability to perform AND and OR queries in concert with one another. For
> example, let's say I want to find everyone whose name is Paul or whose
> name is Gary or whose name is Lee and who has brown hair. That would mean
> I need to look up everything where (name=Paul | name=Gary | name=Lee)
> & hairColor=Brown. Do I need to extend the IntersectingIterator or the
> IndexedDocIterator and write my own that will accept full query
> criteria as input?
>
>
>
> --
> Corey Nolet
> Senior Software Engineer
> TexelTek, inc.
> [Office] 301.880.7123
> [Cell] 410-903-2110
>