Note: We did this early in 2011 but couldn’t talk about it for a while.
SOLR / Elastic Search would follow the same pattern.
Note that depending on what you’re indexing, the index(es) could be larger than the base table by a couple of orders of magnitude.
If you want to tie SOLR to HBase for an in-memory index, you have a decision to make. Do you update the index data in HBase and accept an eventual consistency model, where it takes some time x (variable, and measured in minutes to hours) before the data is available to the index? Or do you update the data in memory first and then persist it to HBase?
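To make the first option concrete, here is a rough sketch of the eventual consistency model. It is plain Python standing in for HBase and the index, not our actual code; the class and method names (DeferredIndexStore, refresh_index) are made up for illustration, and it assumes a simple whitespace-tokenized term index.

```python
from collections import defaultdict


class DeferredIndexStore:
    """Toy model: a base table plus an index that is refreshed in batches."""

    def __init__(self):
        self.base = {}                  # row key -> text (stands in for the base table)
        self.index = defaultdict(set)   # term -> row keys (stands in for the index)
        self.pending = []               # rows written but not yet indexed

    def put(self, row_key, text):
        # The write lands in the base table immediately; indexing is deferred.
        self.base[row_key] = text
        self.pending.append(row_key)

    def refresh_index(self):
        # Batch step that would run "some time x" later.
        # Simplification: terms from overwritten values are not removed.
        for row_key in self.pending:
            for term in self.base[row_key].lower().split():
                self.index[term].add(row_key)
        self.pending.clear()

    def search(self, term):
        return sorted(self.index.get(term.lower(), set()))


store = DeferredIndexStore()
store.put("row1", "hello hbase")
print(store.search("hello"))   # nothing yet: the index has not caught up
store.refresh_index()
print(store.search("hello"))   # now the row is visible
```

The gap between put() and refresh_index() is the window in which a search misses freshly written rows.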
We built and updated the index in HBase because the eventual consistency lag didn’t matter to us. So we had to modify the flow of information.
If you write to SOLR directly, then SOLR has to persist into HBase, and you have to decide what to do with the data when SOLR isn’t available. (Assume that you could, on error, write to HBase.)
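The "on error, write to HBase" idea could look roughly like the sketch below. Everything here is hypothetical: IndexUnavailable, write_with_fallback, and the two writer callables are placeholders, not SOLR or HBase client calls, and the sketch assumes an outage surfaces as an exception.

```python
class IndexUnavailable(Exception):
    """Hypothetical error raised when the search index cannot be reached."""


def write_with_fallback(doc_id, doc, index_write, fallback_write):
    """Try the index first; if it is down, park the document in the fallback store."""
    try:
        index_write(doc_id, doc)
        return "indexed"
    except IndexUnavailable:
        # Assumption: the fallback store (HBase in this thread) is reachable,
        # and something replays these rows into the index later.
        fallback_write(doc_id, doc)
        return "fallback"


# Usage with stand-in writers.
parked = {}


def index_down(doc_id, doc):
    raise IndexUnavailable("index not reachable")


def park_in_store(doc_id, doc):
    parked[doc_id] = doc


print(write_with_fallback("doc1", {"title": "test"}, index_down, park_in_store))
```

You would still need a replay step that feeds the parked rows back into the index once it comes back.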
It’s definitely an option, but you would also have to write the coprocessor code to handle the index writes as you update the base table.
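As a rough picture of what that coprocessor code has to do: on every put to the base table, drop the stale index entry and add the new one. In HBase this logic would sit in a region observer hook such as postPut; the Python below only models the bookkeeping, and IndexedTable and its methods are made-up names, not HBase API.

```python
class IndexedTable:
    """Toy model of a base table with a secondary index kept in step on each put."""

    def __init__(self):
        self.base = {}    # row key -> value (base table)
        self.index = {}   # value -> set of row keys (index table)

    def _post_put(self, row_key, old_value, new_value):
        # Plays the role of the coprocessor hook: runs after the base write.
        if old_value is not None:
            keys = self.index.get(old_value, set())
            keys.discard(row_key)
            if not keys:
                self.index.pop(old_value, None)   # drop empty index rows
        self.index.setdefault(new_value, set()).add(row_key)

    def put(self, row_key, value):
        old_value = self.base.get(row_key)
        self.base[row_key] = value
        self._post_put(row_key, old_value, value)

    def lookup(self, value):
        return sorted(self.index.get(value, set()))


table = IndexedTable()
table.put("row1", "red")
table.put("row2", "red")
table.put("row1", "blue")    # moves row1 from the "red" entry to "blue"
print(table.lookup("red"), table.lookup("blue"))
```

In a real cluster the index row may live on a different region server than the base row, so the two writes are not atomic, and that is the part you have to think through.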
On Jul 16, 2014, at 5:51 AM, 张景鹏 <[EMAIL PROTECTED]> wrote: