Flume >> mail # dev >> EventDrivenSource and dead threads


Juhani Connolly 2013-01-17, 03:45
Brock Noland 2013-01-17, 04:20
Juhani Connolly 2013-01-17, 05:08
Connor Woodson 2013-01-17, 06:01
Juhani Connolly 2013-01-17, 06:21
Connor Woodson 2013-01-17, 07:02
Brock Noland 2013-01-17, 07:37
Juhani Connolly 2013-01-17, 08:27
Re: EventDrivenSource and dead threads
I don't know if this is within the scope of this change, but sinks
sometimes need to be restarted too. For instance, I had an HDFS sink crash
from an Out of Memory error (caused by the JIRA I filed; I wasn't planning
on using a large heap), and it doesn't automatically restart. I don't know
if there's a nice way to detect if a sink/source has crashed, but if so, it
would be nice to have them auto-restart when they go down.

- Connor
On Thu, Jan 17, 2013 at 12:27 AM, Juhani Connolly <
[EMAIL PROTECTED]> wrote:

> What I described isn't really taking control of lifecycle.
>
> What would happen is:
> - source starts. State=START
> - OOM exception happens
> - getLifecycleState called, override calls server.isServing(). This is a
> lightweight call. If it returns false, it either calls stop() or local
> lifeCycleState is set to ERROR (I'm not actually sure how this is handled).
> The latter is probably the correct call in this case, but I'm not entirely
> sure if the supervisor will ever restart it then.
>
> The supervisor would still be doing any restarting. I guess it's really a
> matter of who detects the bad state (getLifecycleState or a scheduled
> runnable) and what that call is allowed to do (switch state to ERROR or call
> stop()). When the source breaks, changing its status to accurately reflect
> this is not breaking the paradigm (though calling stop() may be).
>
> It looks to me like switching the state to ERROR should be sufficient...
> The monitor should then try to start the source again. However, not calling
> stop() before that may cause resource leakage? Could someone confirm the
> behavior with regard to ERROR?
>
>
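The detection idea in the quoted message above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Flume code: LifecycleState, FakeServer and SketchSource are hypothetical stand-ins written for this example, and only the names getLifecycleState and isServing() come from the thread. The sketch takes the "switch state to ERROR" option and leaves any restarting to whoever polls the state.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-ins for illustration; these are not Flume's real types.
enum LifecycleState { IDLE, START, STOP, ERROR }

/** Stand-in for the source's embedded server; only isServing() is named in the thread. */
class FakeServer {
    private final AtomicBoolean serving = new AtomicBoolean(false);

    void serve() { serving.set(true); }

    /** Simulates the serving thread dying, for example after an OutOfMemoryError. */
    void die() { serving.set(false); }

    boolean isServing() { return serving.get(); }
}

class SketchSource {
    private final FakeServer server = new FakeServer();
    private volatile LifecycleState state = LifecycleState.IDLE;

    void start() { server.serve(); state = LifecycleState.START; }

    void stop() { server.die(); state = LifecycleState.STOP; }

    FakeServer server() { return server; }

    /**
     * Instead of returning whatever start()/stop() last set, check cheaply
     * whether the server is still alive and report ERROR if it is not.
     * The source does not restart itself here.
     */
    LifecycleState getLifecycleState() {
        if (state == LifecycleState.START && !server.isServing()) {
            state = LifecycleState.ERROR;
        }
        return state;
    }
}

class LifecycleStateDemo {
    public static void main(String[] args) {
        SketchSource source = new SketchSource();
        source.start();
        System.out.println(source.getLifecycleState()); // prints START

        source.server().die(); // the server thread dies behind the source's back
        System.out.println(source.getLifecycleState()); // prints ERROR
    }
}
```

Whether a real supervisor would restart a component that reports ERROR, and whether stop() must be called first to release resources, is exactly the open question in the message; the sketch does not answer it.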
> On 01/17/2013 04:02 PM, Connor Woodson wrote:
>
>> What I was trying to say is that the ScribeSource should not be responsible
>> for restarting itself (just from my understanding of your idea; it breaks
>> the existing paradigm, as components should not have to control their own
>> lifecycle). Going from Brock's link, I feel the most dynamic solution would
>> be to just add a lifecycle state RESTART and place it in that switch
>> statement; when that state is reached, it tries to stop then restart the
>> component and then sets the desired state to START (or STOP if it
>> couldn't start it again, or sets error=true if it couldn't stop it).
>>
>> And as a way to avoid overriding getLifecycleState to return RESTART,
>> there could either be an inherited function from AbstractSource to call for
>> a restart, there could maybe be an option to set it to RESTART on crash, or
>> something else.
>>
>> I'll admit though that I know little about the lifecycle system, so I have
>> no idea if this idea is any better.
>>
>> - Connor
>>
>>
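The RESTART proposal in the quoted message above might look something like the sketch below. Everything in it is an assumption made for illustration: SketchState, SketchComponent, RestartableComponent and SketchSupervisor are hypothetical names, and the switch statement in Brock's link is not reproduced in this thread, so the monitor method here only mirrors the behavior described in prose (stop, then start, then set the desired state or the error flag).

```java
// Hypothetical stand-ins for illustration; these are not Flume's real types.
enum SketchState { START, STOP, RESTART }

interface SketchComponent {
    void start();
    void stop();
    SketchState getLifecycleState();
}

/** Plays the role of a base class offering an inherited "ask for a restart" helper. */
class RestartableComponent implements SketchComponent {
    private volatile SketchState state = SketchState.STOP;
    int starts = 0; // counters kept only so the demo can show what happened
    int stops = 0;

    public void start() { starts++; state = SketchState.START; }

    public void stop() { stops++; state = SketchState.STOP; }

    public SketchState getLifecycleState() { return state; }

    /** Subclasses call this instead of overriding getLifecycleState. */
    protected void requestRestart() { state = SketchState.RESTART; }

    /** Simulates a crash handler deciding that a restart is needed. */
    void simulateCrash() { requestRestart(); }
}

class SketchSupervisor {
    SketchState desiredState = SketchState.START;
    boolean error = false;

    /** One monitoring pass; the component never restarts itself. */
    void monitor(SketchComponent component) {
        switch (component.getLifecycleState()) {
            case RESTART:
                try {
                    component.stop();
                } catch (RuntimeException e) {
                    error = true; // could not stop it
                    return;
                }
                try {
                    component.start();
                    desiredState = SketchState.START;
                } catch (RuntimeException e) {
                    desiredState = SketchState.STOP; // could not start it again
                }
                break;
            default:
                break; // other states are out of scope for this sketch
        }
    }
}

class RestartStateDemo {
    public static void main(String[] args) {
        RestartableComponent component = new RestartableComponent();
        SketchSupervisor supervisor = new SketchSupervisor();

        component.start();
        component.simulateCrash();
        supervisor.monitor(component);

        System.out.println(component.getLifecycleState()); // prints START
    }
}
```

The point of the design is that the component only signals that it wants a restart, while the supervisor keeps sole control of stop() and start().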
>> On Wed, Jan 16, 2013 at 10:21 PM, Juhani Connolly <[EMAIL PROTECTED]>
>> wrote:
>>
>>> Sink, Source and Channel all extend LifecycleAware, so the function is
>>> available to all components already.
>>>
>>> I was more questioning whether it was reasonable to start including logic
>>> to determine the state. That being said, I think the precedent of just
>>> returning the state set by start/stop is more one of habit, so on further
>>> thought I don't see it as being unreasonable. I'm going to give fixing
>>> ScribeSource with it a poke.
>>>
>>> As to new lifecycle states, I took a pass at reworking the lifecycle model
>>> with Guava's Service implementation in the past, but it took some very
>>> significant changes and didn't get the momentum/interest necessary to keep
>>> working on it. That ticket is here:
>>> https://issues.apache.org/jira/browse/FLUME-966
>>>
>>> Brock might be working on it now, though the issue doesn't appear to have
>>> had attention since October.
>>>
>>>
>>> On 01/17/2013 03:01 PM, Connor Woodson wrote:
>>>
>>>  Why limit it to the sources? If there is going to be a change to one