Flume, mail # user - how to know which line the agent have sent ?


Re: how to know which line the agent have sent ?
hoo.smth 2013-02-16, 09:00
This question has troubled me for a long time, too.
My situation is that I need to transport very important data via Flume; no
loss or duplication is allowed.
Is flume-ng suitable for this situation?
Thank you very much.

On Thu, Feb 7, 2013 at 4:33 PM, 周梦想 <[EMAIL PROTECTED]> wrote:

> OK, thank you very much, JS
>
> Andy
>
>
> 2013/2/7 Jeong-shik Jang <[EMAIL PROTECTED]>
>
>>  I am not sure there is a simple and perfect solution that prevents both
>> loss and duplication on failure, using Flume or anything else.
>> For example, with Flume-OG,
>> using E2E reliability mode, you can minimize loss but duplication can
>> happen; using BE mode with startFromEnd=true for tail, you can minimize
>> duplication but loss can happen.
>>
>> At the moment, we are using a combination of our own plug-ins to minimize
>> the effect of failures and a monitoring/alert system to respond quickly.
>>
>> -JS
>>
>>
>> On 2/7/13 12:24 PM, 周梦想 wrote:
>>
>> So users of Flume don't care if the agent breaks down and fails to send
>> or duplicates the content of the logs? Do they have to write their own
>> sources and sinks?
>> Don't they care about the correctness of the logs? What do they do if the
>> Flume agent exits?
>> I don't understand yet.
>>
>>  Andy
>>
>> 2013/2/7 周梦想 <[EMAIL PROTECTED]>
>>
>>> I see, there is no easy or configuration-based way to know the details
>>> of what has been sent and what hasn't.
>>> I have to write my own source or sink code to do this.
>>> Thank you, Alex and all friends.
>>>
>>>  Andy
>>>
>>>
>>> 2013/2/6 Alexander Alten-Lorenz <[EMAIL PROTECTED]>
>>>
>>>> You have no control in such situations, since tailDir uses tail and
>>>> holds the marker in memory.
>>>>
>>>> We had a thread about this a few days ago:
>>>>
>>>> http://search-hadoop.com/m/JV0lh2RDXLX/flume+tail+source+problem+and+performance&subj=flume+tail+source+problem+and+performance
>>>>
>>>> - Alex
>>>>
>>>> On Feb 6, 2013, at 3:45 AM, 周梦想 <[EMAIL PROTECTED]> wrote:
>>>>
>>>> > Hello,
>>>> >
>>>> > I'm using the tailDirs('mydir') source of the agent to gather logs
>>>> > into Hadoop HDFS. I noticed that some documents advise that if the
>>>> > agent crashes, I have to remove the files in 'mydir' and clear
>>>> > flume.agent.logdir. Thus I will lose some data or have duplicate
>>>> > data, and I don't know up to which line the agent has sent.
>>>> >
>>>> > I'm worried about the agent failing and then resending or failing to
>>>> > send content to the collector. I want to know how to check which line
>>>> > of the log file the agent has sent if the agent exits suddenly. The
>>>> > files in the Flume log dir, such as sending and sent, can't be read.
>>>> >
>>>> > Please give some advice on how to handle such a situation.
>>>> > Thanks.
>>>> >
>>>> > Andy Zhou
>>>>
>>>>  --
>>>> Alexander Alten-Lorenz
>>>> http://mapredit.blogspot.com
>>>> German Hadoop LinkedIn Group: http://goo.gl/N8pCF
>>>>
>>>>
>>>
>>
>>
>> --
>> Jeong-shik Jang / [EMAIL PROTECTED]
>> Gruter, Inc., R&D Team Leader
>> www.gruter.com
>> Enjoy Connecting
>>
>>
>
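
To make the trade-off discussed in this thread concrete (a tail marker held only in memory, and loss versus duplication on failure), here is a minimal sketch in plain Python. It is not Flume code and does not use any Flume API; the names load_offset, save_offset, ship_new_lines and DedupReceiver are hypothetical and exist only for this illustration. The idea, under those assumptions: persist the byte offset to disk after each delivered line, so that a restart resumes from the last committed position. A crash between "send" and "save" can still resend a line, so the receiver drops records whose start offset it has already accepted.

```python
import json
import os
import tempfile


def load_offset(checkpoint_path):
    """Return the last committed byte offset, or 0 if no checkpoint exists."""
    try:
        with open(checkpoint_path, "r", encoding="utf-8") as f:
            return int(json.load(f)["offset"])
    except FileNotFoundError:
        return 0


def save_offset(checkpoint_path, offset):
    """Persist the offset atomically: write a temp file, fsync, then rename."""
    directory = os.path.dirname(os.path.abspath(checkpoint_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump({"offset": offset}, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, checkpoint_path)


def ship_new_lines(log_path, checkpoint_path, send):
    """Send each complete line after the checkpoint; commit after each send.

    `send` is a hypothetical delivery callback taking (start_offset, line).
    Returns the number of lines passed to `send` in this call.
    """
    offset = load_offset(checkpoint_path)
    sent = 0
    with open(log_path, "rb") as f:
        f.seek(offset)
        while True:
            line = f.readline()
            if not line.endswith(b"\n"):
                break  # end of file, or a partially written line: retry later
            send(offset, line)
            offset += len(line)
            save_offset(checkpoint_path, offset)
            sent += 1
    return sent


class DedupReceiver:
    """Receiver that ignores records whose start offset was already accepted."""

    def __init__(self):
        self.seen = set()
        self.lines = []

    def receive(self, offset, line):
        if offset in self.seen:
            return  # duplicate caused by a resend after a failure
        self.seen.add(offset)
        self.lines.append(line)
```

Typical use would be to call ship_new_lines(log_path, checkpoint_path, receiver.receive) periodically. The checkpoint file also answers the original question of "which line has been sent": everything before the stored offset was delivered at least once. This sketch ignores log rotation and keeps the receiver's seen-set in memory, so a real deployment would need more than this.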