I'm curious what a 'good' approach would be for implementing search in
HBase (using Lucene), with the end goal of integrating realtime search
into HBase. I think the use case makes sense: HBase is realtime, has a
write-ahead log, and performs automatic partitioning, splitting of
data, failover, redundancy, etc. Lucene has none of these out of the
box, so we'd essentially get them for 'free'.

For starters: where would be the right place to store Lucene segments
or postings? E.g., we need to be able to iterate linearly, and
efficiently, over the posting list of each term.
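
To make the posting-list requirement concrete, here is a minimal sketch
of one possible layout. It is not HBase or Lucene API: a TreeMap stands
in for a table whose rows are sorted by key, and the class name, the
key layout (term, a NUL separator, a zero-padded doc ID) and the stored
value (a term frequency) are all assumptions made for illustration.
Under those assumptions, one term's postings are adjacent in key order,
so iterating them is a single range scan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a table with lexicographically sorted row keys.
// Nothing here calls HBase or Lucene; it only models the key ordering.
class PostingTableSketch {
    // Assumption: NUL never appears inside a term, so it sorts below any
    // term character and keeps "hbase" postings apart from "hbases".
    private static final char SEP = '\u0000';

    // Row key -> term frequency (the payload is an illustrative choice).
    private final TreeMap<String, Integer> rows = new TreeMap<>();

    // Assumed key layout: term, separator, doc ID zero-padded to 10 digits
    // so that string order matches numeric order for non-negative IDs.
    static String rowKey(String term, int docId) {
        return term + SEP + String.format("%010d", docId);
    }

    void addPosting(String term, int docId, int termFreq) {
        rows.put(rowKey(term, docId), termFreq);
    }

    // Linear iteration over one term's posting list: a range scan from the
    // smallest possible key for the term up to, but excluding, the first
    // key that could belong to a different term.
    List<Integer> docIdsFor(String term) {
        String start = term + SEP;
        String stop = term + (char) (SEP + 1);
        List<Integer> docIds = new ArrayList<>();
        for (Map.Entry<String, Integer> e : rows.subMap(start, stop).entrySet()) {
            docIds.add(Integer.parseInt(e.getKey().substring(start.length())));
        }
        return docIds;
    }
}
```

Whether one row per posting is the right granularity is exactly the
open question; packing a block of postings into each cell would keep
the same scan pattern with far fewer rows.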