Flume >> mail # user >> Guarantees of the memory channel for delivering to sink


Rahul Ravindran 2012-11-06, 21:32
Brock Noland 2012-11-06, 21:38
Rahul Ravindran 2012-11-06, 21:43
Brock Noland 2012-11-06, 21:44
Rahul Ravindran 2012-11-06, 22:53
Brock Noland 2012-11-06, 23:05
Rahul Ravindran 2012-11-06, 23:40
Rahul Ravindran 2012-11-07, 19:29
Brock Noland 2012-11-07, 19:48
Rahul Ravindran 2012-11-07, 19:52
Re: Guarantees of the memory channel for delivering to sink
The memory channel doesn't know about networks. Network-facing components
such as the Avro source and Avro sink do. They operate over TCP/IP, and when
there is an error sending data downstream they roll the transaction back so
that no data is lost. I believe the docs cover this here:
http://flume.apache.org/FlumeUserGuide.html
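
As a rough illustration of the rollback behavior described above, here is a
minimal, self-contained Java sketch. It is a toy model and not Flume's actual
API: ToyChannel, drainOnce, and the send predicate are hypothetical names
invented for this example. It assumes a sink that takes a batch from an
in-memory queue, attempts a network send, and either commits or rolls back.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

// Toy model only: these classes are hypothetical and are NOT Flume's API.
public class RollbackSketch {

    /** In-memory queue standing in for a channel; it knows nothing about networks. */
    static class ToyChannel {
        private final Deque<String> queue = new ArrayDeque<>();
        private final List<String> inFlight = new ArrayList<>();

        void put(String event) { queue.addLast(event); }

        /** Take one event inside the current transaction, or null if empty. */
        String take() {
            String event = queue.pollFirst();
            if (event != null) inFlight.add(event);
            return event;
        }

        /** Delivery succeeded: the taken events are gone for good. */
        void commit() { inFlight.clear(); }

        /** Delivery failed: put the taken events back at the head, in order. */
        void rollback() {
            for (int i = inFlight.size() - 1; i >= 0; i--) {
                queue.addFirst(inFlight.get(i));
            }
            inFlight.clear();
        }

        int size() { return queue.size(); }
    }

    /** One sink attempt: take a batch, try to send it, then commit or roll back. */
    static boolean drainOnce(ToyChannel channel, int batchSize,
                             Predicate<List<String>> send) {
        List<String> batch = new ArrayList<>();
        String event;
        while (batch.size() < batchSize && (event = channel.take()) != null) {
            batch.add(event);
        }
        if (send.test(batch)) {   // stands in for the network call
            channel.commit();
            return true;
        }
        channel.rollback();
        return false;
    }

    public static void main(String[] args) {
        ToyChannel channel = new ToyChannel();
        channel.put("a");
        channel.put("b");
        channel.put("c");

        // Simulated network error: the batch goes back into the channel.
        boolean first = drainOnce(channel, 2, batch -> false);
        System.out.println(first + " " + channel.size());   // false 3

        // Retry succeeds: the two sent events leave the channel.
        boolean second = drainOnce(channel, 2, batch -> true);
        System.out.println(second + " " + channel.size());  // true 1
    }
}
```

Note what the sketch does not cover: the queue lives only in process memory,
so a rollback protects against a failed send but not against the agent
process dying. That is the gap a disk-backed channel is meant to close.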

Brock

On Wed, Nov 7, 2012 at 1:52 PM, Rahul Ravindran <[EMAIL PROTECTED]> wrote:

> Hi,
>
> Thanks for the response.
>
> Does the memory channel provide transactional guarantees? In the event of
> a network packet loss, does it retry sending the packet? If we ensure that
> we do not exceed the capacity for the memory channel, does it continue
> retrying to send an event to the remote source on failure?
>
> Thanks,
> ~Rahul.
>
>   ------------------------------
> *From:* Brock Noland <[EMAIL PROTECTED]>
> *To:* [EMAIL PROTECTED]; Rahul Ravindran <[EMAIL PROTECTED]>
> *Sent:* Wednesday, November 7, 2012 11:48 AM
>
> *Subject:* Re: Guarantees of the memory channel for delivering to sink
>
> Hi,
>
> Yes, if you use the memory channel, you can lose data. To not lose data, the
> file channel needs to write to disk...
>
> Brock
>
> On Wed, Nov 7, 2012 at 1:29 PM, Rahul Ravindran <[EMAIL PROTECTED]> wrote:
>
> Ping on the below questions about new Spool Directory source:
>
> If we choose to use the memory channel with this source, to an Avro sink
> on a remote box, do we risk data loss in the eventuality of a network
> partition/slow network or if the flume-agent on the source box dies?
> If we choose to use the file channel with this source, we will end up with
> double writes to disk, correct? (one for the legacy log files which will be
> ingested by the Spool Directory source, and the other for the WAL)
>
>
>   ------------------------------
> *From:* Rahul Ravindran <[EMAIL PROTECTED]>
>  *To:* "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> *Sent:* Tuesday, November 6, 2012 3:40 PM
>
> *Subject:* Re: Guarantees of the memory channel for delivering to sink
>
> This is awesome.
> This may be perfect for our use case :)
>
> When is the 1.3 release expected?
>
> Couple of questions for the choice of channel for the new source:
>
> If we choose to use the memory channel with this source, to an Avro sink
> on a remote box, do we risk data loss in the eventuality of a network
> partition/slow network or if the flume-agent on the source box dies?
> If we choose to use the file channel with this source, we will end up with
> double writes to disk, correct? (one for the legacy log files which will be
> ingested by the Spool Directory source, and the other for the WAL)
>
> Thanks,
> ~Rahul.
>
>   ------------------------------
> *From:* Brock Noland <[EMAIL PROTECTED]>
> *To:* [EMAIL PROTECTED]; Rahul Ravindran <[EMAIL PROTECTED]>
> *Sent:* Tuesday, November 6, 2012 3:05 PM
> *Subject:* Re: Guarantees of the memory channel for delivering to sink
>
> This use case sounds like a perfect use of the Spool Directory source
> which will be in the upcoming 1.3 release.
>
> Brock
>
> On Tue, Nov 6, 2012 at 4:53 PM, Rahul Ravindran <[EMAIL PROTECTED]> wrote:
> > We will update the checkpoint each time (we may tune this to be periodic)
> > but the contents of the memory channel will be in the legacy logs which are
> > currently being generated.
> >
> > Additionally, the sink for the memory channel will be an Avro source in
> > another machine.
> >
> > Does that clear things up?
> >
> > ________________________________
> > From: Brock Noland <[EMAIL PROTECTED]>
> > To: [EMAIL PROTECTED]; Rahul Ravindran <[EMAIL PROTECTED]>
> > Sent: Tuesday, November 6, 2012 1:44 PM
> >
> > Subject: Re: Guarantees of the memory channel for delivering to sink
> >
> > But in your architecture you are going to write the contents of the
> > memory channel out? Or did I miss something?
> >
> > "The checkpoint will be updated each time we perform a successive
> > insertion into the memory channel."
> >
> > On Tue, Nov 6, 2012 at 3:43 PM, Rahul Ravindran <[EMAIL PROTECTED]> wrote:
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
Rahul Ravindran 2012-11-07, 21:18
Roshan Naik 2012-11-07, 22:57