Flume, mail # user - Problem Events


Re: Problem Events
Roshan Naik 2013-08-01, 17:26
Some questions:
- Why is the sink unable to consume the event?
- How would you like to identify such an event? By examining its content,
or by the fact that it's ping-ponging between channel and sink?
- What would you prefer to do with such an event? Merely drop it?
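
The second and third questions can be made concrete with a small sketch. It assumes the "ping-pong" criterion: an event that fails delivery a fixed number of times is set aside instead of being returned to the channel again. Everything here (`PoisonEventDemo`, `drain`, `maxAttempts`, the use of a `Deque` as a stand-in channel) is hypothetical and is not a Flume API; it only illustrates the idea.

```java
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical illustration only: none of these names are Flume APIs.
final class PoisonEventDemo {

    /**
     * Drains the stand-in channel through the sink. An event whose delivery
     * throws is put back on the channel until it has failed maxAttempts
     * times; it is then moved to the returned "dead letter" list instead.
     * Events are identified by their content here purely for brevity.
     */
    static List<String> drain(Deque<String> channel, Consumer<String> sink, int maxAttempts) {
        Map<String, Integer> failures = new HashMap<>();
        List<String> deadLetters = new ArrayList<>();
        while (!channel.isEmpty()) {
            String event = channel.pollFirst();
            try {
                sink.accept(event);
                failures.remove(event);
            } catch (RuntimeException e) {
                int count = failures.merge(event, 1, Integer::sum);
                if (count >= maxAttempts) {
                    deadLetters.add(event);   // give up: divert instead of retrying
                    failures.remove(event);
                } else {
                    channel.addLast(event);   // "roll back": return it to the channel
                }
            }
        }
        return deadLetters;
    }
}
```

Whether the diverted events are dropped, logged, or written somewhere else is exactly the third question; the sketch only collects them.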
On Thu, Aug 1, 2013 at 9:26 AM, Jeremy Karlson <[EMAIL PROTECTED]> wrote:

> To my knowledge (which is admittedly limited), there is no way to deal
> with these in a way that will make your day.  I'm happy if someone can say
> otherwise.
>
> This is very similar to a problem I had a week or two ago.  I fixed it by
> restarting Flume with debugging on, connecting to it with the debugger, and
> finding the message in the sink.  Discovered a bug in the sink.  Downloaded
> Flume, fixed bug, recompiled, installed custom version, etc.
>
> I agree that this is not a practical solution, and I still believe that
> Flume needs some sort of "sink of last resort" option or something, like
> JMS implementations.
>
> -- Jeremy
>
>
>
> On Thu, Aug 1, 2013 at 2:42 AM, Anat Rozenzon <[EMAIL PROTECTED]> wrote:
>
>> The message is already in the channel.
>> Is there a way to write an interceptor to work after the channel? or
>> before the sink?
>>
>> The only thing I found is to stop everything and delete the channel
>> files, but I won't be able to use this approach in production :-(
>>
>>
>> On Thu, Aug 1, 2013 at 11:13 AM, Ashish <[EMAIL PROTECTED]> wrote:
>>
>>>
>>>
>>>
>>> On Thu, Aug 1, 2013 at 1:29 PM, Anat Rozenzon <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm having the same problem with HDFS sink.
>>>>
>>>> A 'poison' message arrives without the timestamp header that the
>>>> sink expects.
>>>> This causes an NPE, which ends in the message being returned to the
>>>> channel over and over again.
>>>>
>>>> Is my only option to re-write the HDFS sink?
>>>> Isn't there any way to intercept the sink's work?
>>>>
>>>
>>> You can write a custom interceptor and remove/modify the poison message.
>>>
>>> Interceptors are called before a message makes its way into the channel.
>>>
>>> http://flume.apache.org/FlumeUserGuide.html#flume-interceptors
>>>
>>> I wrote a blog about it a while back
>>> http://www.ashishpaliwal.com/blog/2013/06/flume-cookbook-implementing-custom-interceptors/
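
As a rough sketch of what such an interceptor could do for the missing-timestamp case: the helper below fills in a `timestamp` header when it is absent or not a number. It works on a plain `Map<String, String>` so that it runs without Flume on the classpath; in a real interceptor this logic would sit inside `intercept(Event)` and operate on `event.getHeaders()`. The class and method names are hypothetical. Note that this only protects events that have not yet reached the channel, and that Flume's built-in timestamp interceptor may already cover the simple case.

```java
import java.util.Map;

// Hypothetical helper, not part of Flume. It mutates the given header map.
final class TimestampHeaderFix {

    static final String TIMESTAMP = "timestamp";

    /** Ensures headers carry a numeric "timestamp" value (epoch millis as text). */
    static Map<String, String> ensureTimestamp(Map<String, String> headers, long nowMillis) {
        if (!isLong(headers.get(TIMESTAMP))) {
            headers.put(TIMESTAMP, Long.toString(nowMillis));
        }
        return headers;
    }

    private static boolean isLong(String value) {
        if (value == null) {
            return false;
        }
        try {
            Long.parseLong(value);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```

An existing, well-formed timestamp is left untouched, so events that already carry their own event time keep it.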
>>>
>>>
>>>
>>>>
>>>> Thanks
>>>> Anat
>>>>
>>>>
>>>> On Fri, Jul 26, 2013 at 3:35 AM, Arvind Prabhakar <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> Sounds like a bug in ElasticSearch sink to me. Do you mind filing a
>>>>> Jira to track this? Sample data to cause this would be even better.
>>>>>
>>>>> Regards,
>>>>> Arvind Prabhakar
>>>>>
>>>>>
>>>>> On Thu, Jul 25, 2013 at 9:50 AM, Jeremy Karlson <
>>>>> [EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> This was using the provided ElasticSearch sink.  The logs were not
>>>>>> helpful.  I ran it through with the debugger and found the source of the
>>>>>> problem.
>>>>>>
>>>>>> ContentBuilderUtil uses a very "aggressive" method to determine if
>>>>>> the content is JSON; if it contains a "{" anywhere in it, it's considered
>>>>>> JSON.  My body contained that but wasn't JSON, causing the JSON parser to
>>>>>> throw a CharConversionException from addComplexField(...) (but not the
>>>>>> expected JSONException).  We've changed addComplexField(...) to catch
>>>>>> different types of exceptions and fall back to treating it as a simple
>>>>>> field.  We'll probably submit a patch for this soon.
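
The fallback described above might look roughly like the following. This is a sketch under assumptions, not the actual patch: `FieldFallback`, `Parser`, and `toFieldValue` are hypothetical names, and `Parser` stands in for whatever JSON parser the sink uses. The point is only that any parse failure, not just the expected exception type, leads back to treating the body as a simple field.

```java
// Hypothetical sketch; not the ElasticSearch sink's real code.
final class FieldFallback {

    /** Stand-in for a JSON parser that may fail with any exception type. */
    interface Parser {
        Object parse(String body) throws Exception;
    }

    /**
     * Returns the parsed structure when the body looks like JSON and parses
     * cleanly; otherwise returns the body unchanged as a simple field value.
     */
    static Object toFieldValue(String body, Parser parser) {
        if (body.indexOf('{') < 0) {
            return body;                // no brace: never treated as JSON
        }
        try {
            return parser.parse(body);  // the "{" check is only a heuristic
        } catch (Exception e) {
            return body;                // any failure: fall back to a simple field
        }
    }
}
```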
>>>>>>
>>>>>> I'm reasonably happy with this, but I still think that in the bigger
>>>>>> picture there should be some sort of mechanism to automatically detect and
>>>>>> toss / skip / flag problematic events without them plugging up the flow.
>>>>>>
>>>>>> -- Jeremy
>>>>>>
>>>>>>
>>>>>> On Wed, Jul 24, 2013 at 7:51 PM, Arvind Prabhakar <[EMAIL PROTECTED]> wrote:
>>>>>>
>>>>>>> Jeremy, would it be possible for you to show us logs for the part
>>>>>>> where the sink fails to remove an event from the channel? I am assuming
>>>>>>> this is a standard sink that Flume provides and not a custom one.