Flavio Pompermaier 2013-07-24, 08:51
Alexander Alten-Lorenz 2013-07-24, 09:08
Flavio Pompermaier 2013-07-24, 09:29
Alexander Alten-Lorenz 2013-07-24, 10:06
Flavio Pompermaier 2013-07-24, 10:19

Re: HBase source
Your task appears to be more of a periodic batch movement than
continuous streaming. Flume is meant for the latter use case.
-roshan
On Wed, Jul 24, 2013 at 3:19 AM, Flavio Pompermaier <[EMAIL PROTECTED]> wrote:

> In my use case I have a Solr index that proxies access to data stored in
> HBase (I ask Solr for the rowkeys of documents matching some query).
> What I'd like to do is rebuild this Solr index by reading the
> JSON or XML stored in each record, mapping fields to my Solr document, and
> committing.
> I know that this is not the main goal of Flume, but I think it could
> also be used for this kind of task.
> I looked at the tools you suggested, but they seem to be very small
> projects and they do not provide interesting features like those in
> morphlines
> (correct me if I'm wrong!).
>
> Best,
> Flavio
>
>
> On Wed, Jul 24, 2013 at 12:06 PM, Alexander Alten-Lorenz <
> [EMAIL PROTECTED]> wrote:
>
>> Flume is an event collection tool, meaning Flume polls a source or catches
>> events. HBase is a database, and usually stores some kind of data in a
>> schema (CF). You could write a custom source and do a scan on your tables,
>> but really I see no sense in such a task. And a full table scan in HBase is
>> really expensive.
>> What do you mean by reindexing? HBase has primary and secondary indexes
>> (http://hbase.apache.org/book/secondary.indexes.html), which can be
>> processed over filters. To integrate HBase with Solr, you can use one of
>> the tools I mentioned in my previous post or ask on the Solr mailing lists.
>>
>> - Alex
>>
>> On Jul 24, 2013, at 11:29 AM, Flavio Pompermaier <[EMAIL PROTECTED]>
>> wrote:
>>
>> I was thinking of reindexing my data stored in HBase, and Flume + SolrSink
>> seemed perfect for this purpose (although I could obviously write a MapReduce
>> job).
>> Don't you think this could be a common scenario in which Flume could be
>> useful?
>>
>> On Wed, Jul 24, 2013 at 11:08 AM, Alexander Alten-Lorenz <
>> [EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>>
>>> No. And from my perspective it doesn't make sense. I think you are looking for
>>> tools like https://github.com/Photobucket/Solbase or
>>> http://code.google.com/p/hbase-solr-dataimport/.
>>>
>>> - Alex
>>>
>>> On Jul 24, 2013, at 10:51 AM, Flavio Pompermaier <[EMAIL PROTECTED]>
>>> wrote:
>>>
>>> > Hi all,
>>> > I'd like to read data from HBase and move it to Solr.
>>> > Is there an HBase source in Flume or something to read from it?
>>> >
>>> > Best,
>>> > Flavio
>>>
>>> --
>>> Alexander Alten-Lorenz
>>> http://mapredit.blogspot.com
>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
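
The batch rebuild discussed in this thread (read the JSON stored in each record, map fields to a Solr document, commit) might look roughly like the sketch below. It is only an illustration under stated assumptions, not code from any poster: FIELD_MAP, to_solr_doc, batches, and reindex are hypothetical names, the rows are assumed to arrive as (rowkey, JSON string) pairs from whatever performs the HBase scan, and send_batch stands in for whatever client posts documents to Solr and commits.

```python
import json
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

# Hypothetical mapping from JSON keys stored in HBase to Solr field names.
FIELD_MAP = {"title": "title_s", "body": "body_t"}


def to_solr_doc(rowkey: str, raw_json: str) -> Dict[str, object]:
    """Map one stored JSON record to a flat Solr document."""
    record = json.loads(raw_json)
    doc: Dict[str, object] = {"id": rowkey}  # assumption: rowkey is the Solr unique key
    for src, dest in FIELD_MAP.items():
        if src in record:
            doc[dest] = record[src]
    return doc


def batches(docs: Iterable[Dict[str, object]], size: int) -> Iterator[List[Dict[str, object]]]:
    """Group documents into lists of at most `size` items."""
    batch: List[Dict[str, object]] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly smaller, batch
        yield batch


def reindex(
    rows: Iterable[Tuple[str, str]],
    send_batch: Callable[[List[Dict[str, object]]], None],
    batch_size: int = 500,
) -> int:
    """Convert every row and hand the documents to `send_batch` in batches.

    Returns the number of documents sent.
    """
    total = 0
    docs = (to_solr_doc(rowkey, raw) for rowkey, raw in rows)
    for batch in batches(docs, batch_size):
        send_batch(batch)  # placeholder for "add to Solr and commit"
        total += len(batch)
    return total
```

Because the scan and the Solr client are passed in rather than hard-coded, the same loop could run as a periodic job or inside a MapReduce task, which fits the batch nature of the task better than a continuously running Flume agent.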