Flume, mail # user - HBase source


Re: HBase source
Alexander Alten-Lorenz 2013-07-24, 10:06
Flume is an event collection tool, meaning it polls a source or catches events. HBase is a database and usually stores data under a schema (column families). You could write a custom source that scans your tables, but I really see no sense in such a task, and a full table scan in HBase is really expensive.
What do you mean by reindexing? HBase has a primary index (the row key) and supports secondary-index approaches (http://hbase.apache.org/book/secondary.indexes.html), and data can be queried with filters. To integrate HBase with Solr, you can use one of the tools I mentioned in my previous post or ask on the Solr mailing lists.
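
To illustrate what a scan-based custom source would have to do, here is a minimal sketch. It uses no real Flume or HBase API: `FakeTable`, `ScanningSource`, `poll`, and the event dictionary layout are all hypothetical stand-ins, and the "table" is an in-memory dictionary sorted by row key. The point it shows is that a polling source has to read in batches and remember the last row key it emitted, and that it still ends up walking the whole table.

```python
from bisect import bisect_right


class FakeTable:
    """In-memory stand-in for an HBase table (hypothetical, not the HBase API)."""

    def __init__(self, rows):
        self._keys = sorted(rows)   # row keys in sorted order, as in HBase
        self._rows = dict(rows)

    def scan(self, after_key=None, limit=2):
        """Return up to `limit` (key, value) pairs with key > after_key."""
        start = 0 if after_key is None else bisect_right(self._keys, after_key)
        return [(k, self._rows[k]) for k in self._keys[start:start + limit]]


class ScanningSource:
    """Hypothetical pollable source: each poll() reads one batch of rows."""

    def __init__(self, table, sink, batch_size=2):
        self.table = table
        self.sink = sink            # any list-like object standing in for a channel
        self.batch_size = batch_size
        self.last_key = None        # scan position carried between polls

    def poll(self):
        batch = self.table.scan(self.last_key, self.batch_size)
        for key, value in batch:
            # Assumed event shape: headers plus body, loosely modelled on Flume events.
            self.sink.append({"headers": {"rowkey": key}, "body": value})
        if batch:
            self.last_key = batch[-1][0]
        return len(batch)           # 0 means the scan is exhausted


table = FakeTable({"row1": "a", "row2": "b", "row3": "c"})
events = []
source = ScanningSource(table, events)
while source.poll():
    pass
```

Even in this toy form the source touches every row once, which is the cost of a full table scan; a MapReduce job or a dedicated HBase-to-Solr tool distributes that work instead of funnelling it through one agent.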

- Alex

On Jul 24, 2013, at 11:29 AM, Flavio Pompermaier <[EMAIL PROTECTED]> wrote:

> I was thinking of reindexing my data stored in HBase, and Flume + SolrSink seemed perfect for this purpose (although I could obviously write a MapReduce job).
> Don't you think this could be a common scenario in which Flume could be useful?
>
> On Wed, Jul 24, 2013 at 11:08 AM, Alexander Alten-Lorenz <[EMAIL PROTECTED]> wrote:
> Hi,
>
> No. And from my perspective it doesn't make sense. I think you are looking for tools like https://github.com/Photobucket/Solbase or http://code.google.com/p/hbase-solr-dataimport/.
>
> - Alex
>
> On Jul 24, 2013, at 10:51 AM, Flavio Pompermaier <[EMAIL PROTECTED]> wrote:
>
> > Hi to all,
> > I'd like to read data from HBase and move it to Solr.
> > Is there an HBase source in Flume or something to read from it?
> >
> > Best,
> > Flavio
>
> --
> Alexander Alten-Lorenz
> http://mapredit.blogspot.com
> German Hadoop LinkedIn Group: http://goo.gl/N8pCF