Re: flume fail over mechanism
Hi,
1) If the sources and sinks provided by the community are good enough for
you, then don't invent your own. A lot of work has already been done; try
those before writing your own source/sink.
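
For illustration only, here is a minimal agent config wired entirely from stock
components (the agent name, paths, table, and column family are assumptions,
not from this thread):

  # spooling-directory source -> file channel -> HBase sink, all stock components
  agent1.sources  = src1
  agent1.channels = ch1
  agent1.sinks    = sink1

  # stock spooling-directory source: reads files dropped into a directory
  agent1.sources.src1.type     = spooldir
  agent1.sources.src1.spoolDir = /data/incoming
  agent1.sources.src1.channels = ch1

  # durable file channel (see point 2 below)
  agent1.channels.ch1.type          = file
  agent1.channels.ch1.checkpointDir = /flume/checkpoint
  agent1.channels.ch1.dataDirs      = /flume/data

  # stock HBase sink
  agent1.sinks.sink1.type         = hbase
  agent1.sinks.sink1.table        = events
  agent1.sinks.sink1.columnFamily = cf
  agent1.sinks.sink1.serializer   = org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
  agent1.sinks.sink1.channel      = ch1

You would start it with something like:
flume-ng agent --conf conf --conf-file agent1.conf --name agent1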

2) For reliability you should be using a file-based channel, which you
already plan to. For message failures I guess you need to handle them
yourself. You will probably need to write some code there to catch the
failing messages and route them to some other channel, a more lenient one
such as a file-based one, or in the HBase case you could write the whole
failing message to a different table.
You will also need to deal with duplicate messages if you are planning to
use a reliable channel (like the file-based channel).
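
As a rough sketch of that idea (component names, priorities, and the fallback
directory are my assumptions), a failover sink group can divert events to a
lower-priority, more lenient sink whenever the primary HBase sink keeps failing:

  # both sinks drain the same file channel; the failover processor prefers hbaseSink
  agent1.sinks      = hbaseSink fallbackSink
  agent1.sinkgroups = g1
  agent1.sinkgroups.g1.sinks          = hbaseSink fallbackSink
  agent1.sinkgroups.g1.processor.type = failover
  agent1.sinkgroups.g1.processor.priority.hbaseSink    = 10
  agent1.sinkgroups.g1.processor.priority.fallbackSink = 5
  agent1.sinkgroups.g1.processor.maxpenalty            = 10000

  # primary: stock HBase sink
  agent1.sinks.hbaseSink.type         = hbase
  agent1.sinks.hbaseSink.table        = events
  agent1.sinks.hbaseSink.columnFamily = cf
  agent1.sinks.hbaseSink.channel      = ch1

  # fallback: roll events to local files so they can be replayed later
  agent1.sinks.fallbackSink.type           = file_roll
  agent1.sinks.fallbackSink.sink.directory = /flume/failed-events
  agent1.sinks.fallbackSink.channel        = ch1

Note this only covers sink-level outages; routing individual bad messages (for
example to a different HBase table) still needs the custom code mentioned above,
and the file channel's at-least-once delivery is what produces the duplicates
you have to deal with downstream.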

/Ehsan
On Wed, Jan 15, 2014 at 10:43 AM, AnilKumar B <[EMAIL PROTECTED]> wrote:

> I am planning to use file based channels.
>
> Thanks & Regards,
> B Anil Kumar.
>
>
> On Wed, Jan 15, 2014 at 3:12 PM, AnilKumar B <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> In our pipeline we are thinking of using Flume. Our data source can be
>> either a filer, HBase, or Couchbase, and the sink is either a filer
>> (downstream) or another HBase cluster (downstream).
>>
>> So I need some help with the following.
>> 1) To handle multiple sources and sinks, do I need to write custom Flume
>> sinks and sources, or should I use the community's respective sources and sinks?
>>
>> 2) For us, we cannot miss any data. Is there any mechanism in Flume to
>> handle failed messages? I mean, suppose Flume fails to write the records into
>> HBase, how exactly will it take care of that? Or should I maintain the state of
>> each record and handle failed messages based on that state? Is that the
>> correct way? I am trying to use ZooKeeper for state management, so I just
>> want to know whether my approach is correct or not.
>>
>>
>> Thanks & Regards,
>> B Anil Kumar.
>>
>
>
--
*Muhammad Ehsan ul Haque*
Klarna AB
Norra Stationsgatan 61
SE-113 43 Stockholm

Tel: +46 (0)8- 120 120 00
Fax: +46 (0)8- 120 120 99
Web: www.klarna.com