Flume >> mail # dev >> EventDrivenSource and dead threads

Re: EventDrivenSource and dead threads
Hmm, overriding the implementation of getLifecycleState provided by
AbstractSource could work. It would go against the convention that
has been maintained in all other components (that I can think of), though.
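A rough sketch of what such an override might look like, written against stand-in types so it compiles on its own. The LifecycleState enum below only mirrors the states Flume uses, and WorkerBackedSource and awaitWorkerExit are hypothetical names for illustration; a real source would extend AbstractSource instead.

```java
// Stand-in for Flume's LifecycleState enum so the sketch is self-contained.
enum LifecycleState { IDLE, START, STOP, ERROR }

// Hypothetical event-driven source whose reported state reflects worker liveness.
class WorkerBackedSource {
    private volatile LifecycleState state = LifecycleState.IDLE;
    private volatile Thread worker;
    private final Runnable work;

    WorkerBackedSource(Runnable work) {
        this.work = work;
    }

    void start() {
        worker = new Thread(work, "event-worker");
        worker.setDaemon(true);
        worker.start();
        state = LifecycleState.START;
    }

    void stop() {
        Thread t = worker;
        if (t != null) {
            t.interrupt();
        }
        state = LifecycleState.STOP;
    }

    // Instead of returning the stored field unconditionally, report ERROR
    // when the source was started but its worker thread is no longer alive.
    LifecycleState getLifecycleState() {
        Thread t = worker;
        if (state == LifecycleState.START && t != null && !t.isAlive()) {
            return LifecycleState.ERROR;
        }
        return state;
    }

    // Demo helper (assumption): wait up to the given time for the worker to exit.
    void awaitWorkerExit(long millis) {
        Thread t = worker;
        if (t == null) {
            return;
        }
        try {
            t.join(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With this shape no extra thread is needed, since liveness is only checked when something asks for the state. Whether the supervisor would react to the reported state the way we want is a separate question that the sketch does not answer.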

On 01/17/2013 01:20 PM, Brock Noland wrote:
> Hi,
> Yes I can definitely see the issue. It sucks that we'd have to add yet
> another thread. An alternative which wouldn't require another thread
> would be to check the optional interface in the supervisor,
> approximately here:
> https://github.com/apache/flume/blob/trunk/flume-ng-core/src/main/java/org/apache/flume/lifecycle/LifecycleSupervisor.java#L240
> However, I am not sold on the supervisor being the best place to fix
> this as I am not sure that other lifecycle components would need this.
> Brock
> On Wed, Jan 16, 2013 at 7:45 PM, Juhani Connolly
> <[EMAIL PROTECTED]> wrote:
>> I came upon an issue with ScribeSource, though it's theoretically
>> applicable to any EventDrivenSource whose event-generating thread(s) die.
>> Simply put, sending a bad packet to the thrift (scribe protocol) port will
>> result in it trying to allocate space for some arbitrarily large packet,
>> resulting in an OutOfMemoryError which kills the thread (incidentally, I thought
>> this would be an issue in avro too, but it throws an exception before making
>> excessive allocation requests).
>> As far as flume is concerned, the component is still alive. stop() was never
>> called, so even monitoring the component state using jmx will not notice
>> anything wrong. This situation occurs from user error, but there is
>> potential for other errors leaving a zombie component. I think it would be
>> more user friendly to be able to recover from such errors.
>> I'm thinking of adding a StatusPollable interface that EventDrivenSources
>> can optionally implement (because we can't change the interface without a
>> version change). If implemented, the EventDrivenSourceRunner would schedule
>> a regular poll to check the state. Upon failure it could call stop()
>> to signal that it broke. With autoRestartPolicy, the source would then get
>> restarted by its supervisor.
>> Would appreciate any opinions before I put together a patch/post an issue.
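For reference, the polling proposal quoted above could be sketched roughly as follows. Only the name StatusPollable comes from the proposal; its isHealthy method, the StoppableSource stand-in, StatusPoller, and DemoSource are assumptions made for illustration and are not existing Flume APIs. The restart by the supervisor is not modelled here.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Optional interface from the proposal; the method name is an assumption.
interface StatusPollable {
    boolean isHealthy();
}

// Stand-in for the part of a source's lifecycle the runner needs.
interface StoppableSource {
    void stop();
}

// Hypothetical runner-side helper that polls sources which opted in.
class StatusPoller {
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "status-poller");
            t.setDaemon(true);
            return t;
        });
    private final AtomicBoolean stopped = new AtomicBoolean(false);

    // Schedule a regular poll, but only if the source implements StatusPollable.
    void watch(StoppableSource source, long periodMillis) {
        if (!(source instanceof StatusPollable)) {
            return; // source did not opt in; behaviour is unchanged
        }
        StatusPollable pollable = (StatusPollable) source;
        scheduler.scheduleWithFixedDelay(
            () -> checkOnce(source, pollable),
            periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    // One poll: on failure call stop() exactly once and end the polling.
    // Returns true if this call stopped the source.
    boolean checkOnce(StoppableSource source, StatusPollable pollable) {
        if (!pollable.isHealthy() && stopped.compareAndSet(false, true)) {
            source.stop();
            scheduler.shutdown();
            return true;
        }
        return false;
    }

    void shutdown() {
        scheduler.shutdownNow();
    }
}

// Demo source whose health flag stands in for "is the worker thread alive".
class DemoSource implements StoppableSource, StatusPollable {
    volatile boolean workerAlive = true;
    volatile int stopCalls = 0;

    public boolean isHealthy() {
        return workerAlive;
    }

    public void stop() {
        stopCalls++;
    }
}
```

The instanceof check keeps the interface optional, so existing sources need no changes. The cost is the extra scheduler thread mentioned earlier in the thread.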