Flume >> mail # user >> Problem Events


Re: Problem Events
This was using the provided ElasticSearch sink.  The logs were not helpful.
I ran it through with the debugger and found the source of the problem.

ContentBuilderUtil uses a very "aggressive" method to determine if the
content is JSON; if it contains a "{" anywhere in it, it's considered JSON.
My event body contained a "{" but wasn't JSON, causing the JSON parser to
throw a CharConversionException from addComplexField(...) (rather than the
expected JSONException).  We've changed addComplexField(...) to catch
different types of exceptions and fall back to treating the content as a
simple field.  We'll probably submit a patch for this soon.
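
For anyone following along, the fallback described above can be sketched
roughly like this. This is not the actual ContentBuilderUtil code or our
patch; the class name, the ComplexParser interface and the stricter check
are hypothetical stand-ins used only to illustrate the idea of "try the
complex path, and on any parse failure store the raw text instead".

```java
import java.util.Map;

final class FieldFallbackSketch {

    /** Hypothetical stand-in for a real JSON parser; it may fail in many ways. */
    interface ComplexParser {
        Object parse(String body) throws Exception;
    }

    /** The "aggressive" heuristic described in the thread: any "{" counts as JSON. */
    static boolean looksLikeJsonAggressive(String body) {
        return body.contains("{");
    }

    /** An assumed, slightly stricter alternative: the trimmed body must be delimited. */
    static boolean looksLikeJsonStricter(String body) {
        String t = body.trim();
        return (t.startsWith("{") && t.endsWith("}"))
            || (t.startsWith("[") && t.endsWith("]"));
    }

    /**
     * Try to add the body as a parsed (complex) field. If the parser throws
     * anything at all, fall back to storing the raw string as a simple field.
     */
    static void addField(Map<String, Object> doc, String name, String body,
                         ComplexParser parser) {
        if (looksLikeJsonAggressive(body)) {
            try {
                doc.put(name, parser.parse(body));
                return;
            } catch (Exception e) {
                // Deliberately broad: the parser is not guaranteed to throw
                // only its own exception type. Fall through to the simple field.
            }
        }
        doc.put(name, body);
    }
}
```

Catching Exception broadly is a judgment call; the point is only that a
body which merely looks like JSON should never be able to wedge the sink.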

I'm reasonably happy with this, but I still think that in the bigger
picture there should be some sort of mechanism to automatically detect and
toss / skip / flag problematic events without them plugging up the flow.
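
To make that concrete, here is a minimal sketch of the kind of mechanism I
mean: after n failed attempts the event is moved to a "problem area" instead
of going back to the head of the channel. This is not a Flume API or
feature; the Deque standing in for the channel, the Predicate standing in
for the sink, and the names drain, maxAttempts and deadLetters are all
hypothetical.

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

final class PoisonEventSketch {

    /**
     * Drains the channel through the sink. An event that the sink rejects
     * (returns false or throws) maxAttempts times is moved to deadLetters
     * so that the events behind it are no longer blocked.
     * Returns the events that were delivered, in order.
     */
    static List<String> drain(Deque<String> channel, Predicate<String> sink,
                              int maxAttempts, List<String> deadLetters) {
        List<String> delivered = new ArrayList<>();
        int failures = 0; // failed attempts for the event currently at the head
        while (!channel.isEmpty()) {
            String event = channel.peekFirst();
            boolean accepted;
            try {
                accepted = sink.test(event);
            } catch (RuntimeException e) {
                accepted = false; // treat an exception like a refusal
            }
            if (accepted) {
                delivered.add(channel.pollFirst());
                failures = 0;
            } else if (++failures >= maxAttempts) {
                // Give up on this event and set it aside for later inspection.
                deadLetters.add(channel.pollFirst());
                failures = 0;
            }
            // Otherwise the event stays at the head and is retried.
        }
        return delivered;
    }
}
```

A real version would want a backoff between attempts and a durable place
for the set-aside events, but the control flow is the part I care about.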

-- Jeremy
On Wed, Jul 24, 2013 at 7:51 PM, Arvind Prabhakar <[EMAIL PROTECTED]> wrote:

> Jeremy, would it be possible for you to show us logs for the part where
> the sink fails to remove an event from the channel? I am assuming this is a
> standard sink that Flume provides and not a custom one.
>
> The reason I ask is because sinks do not introspect the event, and hence
> there is no reason why it will fail during the event's removal. It is more
> likely that there is a problem within the channel in that it cannot
> dereference the event correctly. Looking at the logs will help us identify
> the root cause for what you are experiencing.
>
> Regards,
> Arvind Prabhakar
>
>
> On Wed, Jul 24, 2013 at 3:56 PM, Jeremy Karlson <[EMAIL PROTECTED]> wrote:
>
>> Both reasonable suggestions.  What would a custom sink look like in this
>> case, and how would I only eliminate the problem events since I don't know
>> what they are until they are attempted by the "real" sink?
>>
>> My philosophical concern (in general) is that we're taking the approach
>> of exhaustively finding and eliminating possible failure cases.  It's not
>> possible to eliminate every single failure case, so shouldn't there be a
>> method of last resort to eliminate problem events from the channel?
>>
>> -- Jeremy
>>
>>
>>
>> On Wed, Jul 24, 2013 at 3:45 PM, Hari Shreedharan <
>> [EMAIL PROTECTED]> wrote:
>>
>>> Or you could write a custom sink that removes this event (more work of
>>> course)
>>>
>>>
>>> Thanks,
>>> Hari
>>>
>>> On Wednesday, July 24, 2013 at 3:36 PM, Roshan Naik wrote:
>>>
>>> If you have a way to identify such events, you may be able to use the
>>> Regex interceptor to toss them out before they get into the channel.
>>>
>>>
>>>  On Wed, Jul 24, 2013 at 2:52 PM, Jeremy Karlson <
>>> [EMAIL PROTECTED]> wrote:
>>>
>>> Hi everyone.  My Flume adventures continue.
>>>
>>> I'm in a situation now where I have a channel that's filling because a
>>> stubborn message is stuck.  The sink won't accept it (for whatever reason;
>>> I can go into detail but that's not my point here).  This just blocks up
>>> the channel entirely, because it goes back into the channel when the sink
>>> refuses.  Obviously, this isn't ideal.
>>>
>>> I'm wondering what mechanisms, if any, Flume has to deal with these
>>> situations.  Things that come to mind might be:
>>>
>>> 1. Ditch the event after n attempts.
>>> 2. After n attempts, send the event to a "problem area" (maybe a
>>> different source / sink / channel?)  that someone can look at later.
>>> 3. Some sort of mechanism that allows operators to manually kill these
>>> messages.
>>>
>>> I'm open to suggestions on alternatives as well.
>>>
>>> Thanks.
>>>
>>> -- Jeremy
>>>
>>>
>>>
>>>
>>
>