Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Flume >> mail # dev >> EventDrivenSource and dead threads

Copy link to this message
Re: EventDrivenSource and dead threads
Sink, Source and Channel all extend LifecycleAware, so the function is
available to all components already.

I was more questioning whether it was reasonable to start including
logic to determine the state. That being said, I think the precedent of
just returning the state set by start/stop is more one of habit, so on
further thought I don't see it as being unreasonable. I'm going to give
fixing ScribeSource with it a poke.

As to new lifecycle states, I took a pass at reworking the lifecycle
model with guavas service implementation in the past, but it took some
very significant changes and didn't get the momentum/interest necessary
to keep working on it. That ticket is here
https://issues.apache.org/jira/browse/FLUME-966 . Brock might be working
on it now though the issue doesn't appear to have had attention since

On 01/17/2013 03:01 PM, Connor Woodson wrote:
> Why limit it to the sources? If there is going to be a change to one
> component's lifecycle, then I see no reason not to change every component's
> lifecycle.
> Sinks and Channels could very well have this problem; so what about giving
> each LifecycleAware component a takePulse method (or something); or to
> avoid creating a new method, add a new lifecycle state 'crashed' or such
> which, when detected, causes a restart of the component. Then components
> would just need to override getLifecycleState (if this method is polled
> regularly; I don't know. maybe use a listener for when there's a state
> change) to detect if it has crashed/needs to be restarted.
> Just my thoughts,
> - Connor
> On Wed, Jan 16, 2013 at 9:08 PM, Juhani Connolly <
>> Hmm, overriding the implementation of getLifecycleState provided by
>> AbstractSource could work. It would be going against the convention that
>> has been maintained in all other components(that I can think of)
>> On 01/17/2013 01:20 PM, Brock Noland wrote:
>>> Hi,
>>> Yes I can definitely see the issue. It sucks that we'd have to add yet
>>> another thread. An alternative which wouldn't require another thread
>>> would be to check the optional interface in the supervisor,
>>> approximately here:
>>> https://github.com/apache/**flume/blob/trunk/flume-ng-**
>>> core/src/main/java/org/apache/**flume/lifecycle/**
>>> LifecycleSupervisor.java#L240<https://github.com/apache/flume/blob/trunk/flume-ng-core/src/main/java/org/apache/flume/lifecycle/LifecycleSupervisor.java#L240>
>>> However, I am not sold on the supervisor being the best place to fix
>>> this as I am not sure that other lifecycle components would need this.
>>> Brock
>>> On Wed, Jan 16, 2013 at 7:45 PM, Juhani Connolly
>>> <juhani_connolly@cyberagent.**co.jp <[EMAIL PROTECTED]>>
>>> wrote:
>>>> I came upon an issue with ScribeSource,  though it's theoretically
>>>> applicable to any EventDrivenSource whose event generating thread(s) die.
>>>> Simple put, sending a bad packet to the thrift(scribe protocol) port will
>>>> result in it trying to allocate space for some arbitrarily large packet
>>>> resulting in an OOMException which kills the thread(incidentally I
>>>> thought
>>>> this would be an issue in avro too, but it throws an exception before
>>>> making
>>>> excessive allocation requests).
>>>> As far as flume is concerned, the component is still alive. stop() was
>>>> never
>>>> called, so even monitoring the component state using jmx will not notice
>>>> anything wrong. This situation occurs from user error, but there is
>>>> potential for other errors leaving a zombie component. I think it would
>>>> be
>>>> more user friendly to be able to recover from such errors.
>>>> I'm thinking of adding a StatusPollable interface that EventDrivenSources
>>>> can optionally implement(because we can't change the interface without a
>>>> version change). If implemented, the EventDrivenSourceRunner would
>>>> schedule
>>>> a regular poll to check the state. Upon failure it could either call