Flume >> mail # user >> HBase source

Flume is an event collection tool, meaning it either polls a source or receives events as they arrive. HBase is a database and usually stores data under a schema (column families, CF). You could write a custom source that scans your tables, but I really see no sense in such a task, and a full table scan in HBase is really expensive.
What do you mean by reindexing? HBase has a primary index on the row key, and secondary indexes can be built in several ways (http://hbase.apache.org/book/secondary.indexes.html), for example by querying with filters. To integrate HBase with Solr, you can use one of the tools I mentioned in my previous post or ask on the Solr mailing lists.
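
For illustration only, this is roughly what the Flume side of such a setup might look like. It is a sketch under assumptions, not a recommendation: the source class `com.example.flume.HBaseScanSource` is hypothetical (Flume ships no HBase source, so you would have to write it yourself), and the agent name, component names, channel capacity and morphline file path are placeholders.

```properties
# Sketch only: the source type below is a hypothetical custom class,
# not something that exists in Flume.
agent.sources = hbaseScan
agent.channels = mem
agent.sinks = solr

# Hypothetical custom source that would scan an HBase table
# and emit one Flume event per row.
agent.sources.hbaseScan.type = com.example.flume.HBaseScanSource
agent.sources.hbaseScan.channels = mem

# In-memory channel between source and sink; capacity is a placeholder.
agent.channels.mem.type = memory
agent.channels.mem.capacity = 10000

# MorphlineSolrSink is the Solr sink bundled with Flume;
# the morphline file path is a placeholder.
agent.sinks.solr.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
agent.sinks.solr.channel = mem
agent.sinks.solr.morphlineFile = /etc/flume-ng/conf/morphline.conf
```

Every full run of such a source would still be a full table scan, which is the expensive part mentioned above.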

- Alex

On Jul 24, 2013, at 11:29 AM, Flavio Pompermaier <[EMAIL PROTECTED]> wrote:

> I was thinking of reindexing my data stored in HBase, and Flume + SolrSink seemed perfect for this purpose (although I could obviously write a MapReduce job).
> Don't you think this could be a common scenario in which Flume could be useful?
> On Wed, Jul 24, 2013 at 11:08 AM, Alexander Alten-Lorenz <[EMAIL PROTECTED]> wrote:
> Hi,
> No. And from my perspective it doesn't make sense. I think you are looking for tools like https://github.com/Photobucket/Solbase or http://code.google.com/p/hbase-solr-dataimport/.
> - Alex
> On Jul 24, 2013, at 10:51 AM, Flavio Pompermaier <[EMAIL PROTECTED]> wrote:
> > Hi to all,
> > I'd like to read data from HBase and move it to Solr.
> > Is there an HBase source in Flume or something to read from it?
> >
> > Best,
> > Flavio
> --
> Alexander Alten-Lorenz
> http://mapredit.blogspot.com
> German Hadoop LinkedIn Group: http://goo.gl/N8pCF