

Re: Problem Events
On Thu, Aug 1, 2013 at 1:29 PM, Anat Rozenzon <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I'm having the same problem with HDFS sink.
>
> A 'poison' message lacks the timestamp header that the sink
> expects.
> This causes an NPE, which ends in the message being returned to the
> channel over and over again.
>
> Is my only option to rewrite the HDFS sink?
> Isn't there any way to intercept the sink's work?
>

You can write a custom interceptor that removes or modifies the poison message.

Interceptors are called before an event makes its way into the channel.

http://flume.apache.org/FlumeUserGuide.html#flume-interceptors

I wrote a blog post about it a while back:
http://www.ashishpaliwal.com/blog/2013/06/flume-cookbook-implementing-custom-interceptors/
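
To make the idea concrete, here is a minimal sketch of the filtering logic such an interceptor could use. It is written against stand-in types so it compiles without Flume on the classpath: the Event interface below is a hypothetical reduction of org.apache.flume.Event, and SimpleEvent and TimestampGuardInterceptor are illustrative names, not Flume classes. It assumes the sink needs a "timestamp" header holding epoch milliseconds, and that dropping the event is acceptable. It relies on the Flume convention that returning null from intercept(Event) drops the event.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for org.apache.flume.Event (headers + body only).
interface Event {
    Map<String, String> getHeaders();
    byte[] getBody();
}

// Hypothetical helper used only to build events for this sketch.
final class SimpleEvent implements Event {
    private final Map<String, String> headers;
    private final byte[] body;

    SimpleEvent(Map<String, String> headers, String body) {
        this.headers = new HashMap<>(headers);
        this.body = body.getBytes(StandardCharsets.UTF_8);
    }

    @Override public Map<String, String> getHeaders() { return headers; }
    @Override public byte[] getBody() { return body; }
}

// Illustrative only: a real version would implement
// org.apache.flume.interceptor.Interceptor instead of standing alone.
final class TimestampGuardInterceptor {
    // Assumption: the sink looks for this header key.
    static final String TIMESTAMP = "timestamp";

    // Returns the event unchanged if it carries a usable timestamp,
    // or null to signal that the event should be dropped.
    Event intercept(Event event) {
        String ts = event.getHeaders().get(TIMESTAMP);
        if (ts == null) {
            return null; // missing header: this is the poison case
        }
        try {
            Long.parseLong(ts.trim());
        } catch (NumberFormatException e) {
            return null; // header present but not epoch milliseconds
        }
        return event;
    }

    // Batch variant: keeps only the events that survive the check above.
    List<Event> intercept(List<Event> events) {
        List<Event> kept = new ArrayList<>(events.size());
        for (Event e : events) {
            Event out = intercept(e);
            if (out != null) {
                kept.add(out);
            }
        }
        return kept;
    }
}
```

A real interceptor would also need initialize(), close() and a nested Builder, and would be attached to the source as described in the user guide linked above. If losing the event is not acceptable, the null branches could set a fallback timestamp header instead of dropping.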

>
> Thanks
> Anat
>
>
> On Fri, Jul 26, 2013 at 3:35 AM, Arvind Prabhakar <[EMAIL PROTECTED]> wrote:
>
>> Sounds like a bug in ElasticSearch sink to me. Do you mind filing a Jira
>> to track this? Sample data to cause this would be even better.
>>
>> Regards,
>> Arvind Prabhakar
>>
>>
>> On Thu, Jul 25, 2013 at 9:50 AM, Jeremy Karlson <[EMAIL PROTECTED]> wrote:
>>
>>> This was using the provided ElasticSearch sink.  The logs were not
>>> helpful.  I ran it through with the debugger and found the source of the
>>> problem.
>>>
>>> ContentBuilderUtil uses a very "aggressive" method to determine if the
>>> content is JSON; if it contains a "{" anywhere in it, it's considered JSON.
>>>  My body contained that but wasn't JSON, causing the JSON parser to throw a
>>> CharConversionException from addComplexField(...) (but not the expected
>>> JSONException).  We've changed addComplexField(...) to catch different
>>> types of exceptions and fall back to treating it as a simple field.  We'll
>>> probably submit a patch for this soon.
>>>
>>> I'm reasonably happy with this, but I still think that in the bigger
>>> picture there should be some sort of mechanism to automatically detect and
>>> toss / skip / flag problematic events without them plugging up the flow.
>>>
>>> -- Jeremy
>>>
>>>
>>> On Wed, Jul 24, 2013 at 7:51 PM, Arvind Prabhakar <[EMAIL PROTECTED]> wrote:
>>>
>>>> Jeremy, would it be possible for you to show us logs for the part where
>>>> the sink fails to remove an event from the channel? I am assuming this is a
>>>> standard sink that Flume provides and not a custom one.
>>>>
>>>> The reason I ask is because sinks do not introspect the event, and
>>>> hence there is no reason why it will fail during the event's removal. It is
>>>> more likely that there is a problem within the channel in that it cannot
>>>> dereference the event correctly. Looking at the logs will help us identify
>>>> the root cause for what you are experiencing.
>>>>
>>>> Regards,
>>>> Arvind Prabhakar
>>>>
>>>>
>>>> On Wed, Jul 24, 2013 at 3:56 PM, Jeremy Karlson <
>>>> [EMAIL PROTECTED]> wrote:
>>>>
>>>>> Both reasonable suggestions.  What would a custom sink look like in
>>>>> this case, and how would I only eliminate the problem events since I don't
>>>>> know what they are until they are attempted by the "real" sink?
>>>>>
>>>>> My philosophical concern (in general) is that we're taking the
>>>>> approach of exhaustively finding and eliminating possible failure cases.
>>>>>  It's not possible to eliminate every single failure case, so shouldn't
>>>>> there be a method of last resort to eliminate problem events from the
>>>>> channel?
>>>>>
>>>>> -- Jeremy
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Jul 24, 2013 at 3:45 PM, Hari Shreedharan <
>>>>> [EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> Or you could write a custom sink that removes this event (more work
>>>>>> of course)
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Hari
>>>>>>
>>>>>> On Wednesday, July 24, 2013 at 3:36 PM, Roshan Naik wrote:
>>>>>>
>>>>>> If you have a way to identify such events, you may be able to use
>>>>>> the Regex interceptor to toss them out before they get into the channel.
>>>>>>
>>>>>>
>>>>>>  On Wed, Jul 24, 2013 at 2:52 PM, Jeremy Karlson <
thanks
ashish

Blog: http://www.ashishpaliwal.com/blog
My Photo Galleries: http://www.pbase.com/ashishpaliwal