Flume, mail # dev - Re: [jira] [Created] (FLUME-1479) Multiple Sinks can connect to single Channel


Wang, Yongkun | Yongkun |... 2012-08-10, 07:53
Hi Denny,

Yes, I agree. I cannot use a restrictive commit policy if the sinks run at
different speeds. That's why I defined two flexible commit policies.
For example, the coordinator could commit the transaction once M (0 <= M <= N)
fast sinks acknowledge success to the coordinator.
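
The count-based policy can be sketched in Java as follows. This is only an illustration under stated assumptions: the class `CommitPolicy` and its methods `atLeast`, `quorum`, `all`, and `shouldCommit` are hypothetical names, not part of Flume or of the FLUME-1479 patch, and "quorum" is assumed here to mean a strict majority of the N sinks.

```java
import java.util.List;

// Hypothetical sketch: none of these names come from Flume or the FLUME-1479 patch.
final class CommitPolicy {
    private final int totalSinks;   // N
    private final int requiredAcks; // M, with 0 <= M <= N

    private CommitPolicy(int totalSinks, int requiredAcks) {
        if (totalSinks < 0 || requiredAcks < 0 || requiredAcks > totalSinks) {
            throw new IllegalArgumentException("need 0 <= M <= N");
        }
        this.totalSinks = totalSinks;
        this.requiredAcks = requiredAcks;
    }

    // Commit once at least m of the n sinks have acknowledged success.
    static CommitPolicy atLeast(int m, int n) {
        return new CommitPolicy(n, m);
    }

    // Assumption: quorum means a strict majority of the n sinks.
    static CommitPolicy quorum(int n) {
        return new CommitPolicy(n, n / 2 + 1);
    }

    // Commit only when every sink has acknowledged success.
    static CommitPolicy all(int n) {
        return new CommitPolicy(n, n);
    }

    // acks.get(i) is true if sink i acknowledged success for the current event.
    boolean shouldCommit(List<Boolean> acks) {
        if (acks.size() != totalSinks) {
            throw new IllegalArgumentException("expected one ack flag per sink");
        }
        long succeeded = acks.stream().filter(Boolean::booleanValue).count();
        return succeeded >= requiredAcks;
    }
}
```

With three sinks, `CommitPolicy.atLeast(2, 3)` would commit when two sinks acknowledge and one fails, while `CommitPolicy.all(3)` would roll back in the same situation.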

Regards,
Yongkun Wang
On 12/08/10 15:58, "Denny Ye" <[EMAIL PROTECTED]> wrote:

>Hi Yongkun,
>    OK, so your key baseline assumption is a similar consumption rate for
>each Sink. In practice, that is rarely achievable: the slowest Sink will be
>the limitation or bottleneck in your design. If the case in my first
>question cannot occur, then I think you provide a simplified rollback
>model. Do you agree?
>
>-Regards
>Denny Ye
>
>2012/8/10 Wang, Yongkun | Yongkun | BDD <[EMAIL PROTECTED]>
>
>> Hi Denny,
>>
>> Thanks for the questions. Answers inline.
>>
>> On 12/08/10 15:09, "Denny Ye" <[EMAIL PROTECTED]> wrote:
>>
>> >Yongkun,
>> >    Now I understand your design. Thanks for your explanation.
>> >    I have two questions; please help to explain, thanks!
>> >    1. Two Sinks have different consumption rates. If the Channel has 1000
>> >events, sinkA has consumed 800 events and sinkB has consumed 100 events,
>> >when do we remove the fully consumed events from the Channel?
>>
>> In my design, I try to avoid this case: SinkA and SinkB are synchronized
>> and both get all 1000 events if the mode is replicating. Events are not
>> removed by the Sink itself (that is, by calling channel.take() in the
>> sink's process()); instead, they are removed by a higher-level sink
>> processor, which removes an event once the sinks satisfy the transaction
>> requirements.
>>
>> >    2. An exception happens at one Sink. Each Sink retrieves 100 events
>> >from the Channel, and an exception occurs at sinkA. sinkA should roll
>> >back. What exactly happens in your design?
>>
>> Yes, transaction control across multiple sinks is more complicated. In my
>> design, I have two policies for committing a multi-sink transaction
>> (suppose we have N sinks):
>>
>> - Commit when M (0 <= M <= N) sinks succeed; example values for M: ANY,
>>   ONE, QUORUM, ALL.
>> - Commit when M (0 < M <= N) specified (important) sinks succeed.
>>
>> Otherwise, roll back all sinks for the current event.
>>
>>
>> Regards,
>> Yongkun
>>
>> >
>> >-Regards
>> >Denny Ye
>> >
>> >2012/8/10 Wang, Yongkun | Yongkun | BDD <[EMAIL PROTECTED]>
>> >
>> >> Hi Denny,
>> >>
>> >> I am working on the patch now; it's not difficult. I have listed the
>> >> changes in that JIRA.
>> >> I think you misunderstand my design: I don't maintain the order of the
>> >> events. Instead, I make sure that each sink gets the same events (or
>> >> different events, as specified by the selector).
>> >>
>> >> Suppose Channel (mc) contains the following events: 4,3,2,1
>> >>
>> >> If we simply enable it through configuration, it may work like this:
>> >> Sink "hsa" may get 1,3;
>> >> Sink "hsb" may get 2,4;
>> >> So different sinks will get different data. Is this what the user wants?
>> >>
>> >>
>> >> In my design, "hsa" and "hsb" will both get "4,3,2,1". This is a typical
>> >> case when a user wants to fan out the data to two places (e.g. one for
>> >> batch and another for real-time analysis).
>> >>
>> >> Regards,
>> >> Yongkun Wang
>> >>
>> >>
>> >> On 12/08/10 14:29, "Denny Ye" <[EMAIL PROTECTED]> wrote:
>> >>
>> >> >hi Yongkun,
>> >> >
>> >> >   JIRA can be accessed now.
>> >> >
>> >> >   I think it might be difficult to reason about the order of events in
>> >> >your proposal. If we don't care about the order, we can discuss its value
>> >> >and feasibility. In my opinion, a data ingest flow is order-unaware, or at
>> >> >least ordering is not that important for us. You can try to verify your
>> >> >proposal and give us the results. There may be some difficulty in keeping
>> >> >a transaction across several Sinks.
>> >> >
>> >> >-Regards
>> >> >Denny Ye
>> >> >
>> >> >
>> >> >2012/8/10 Wang, Yongkun | Yongkun | BDD <[EMAIL PROTECTED]>
>> >> >
>>