-
How to know which line the agent has sent? 周梦想 2013-02-06, 02:45
Hello,

I'm using the tailDirs('mydir') source on the agent to collect logs into Hadoop HDFS. Some documents advise that if the agent crashes, I have to remove the files in 'mydir' and clear flume.agent.logdir. That means I will either lose some data or get duplicate data, and I don't know which line the agent had reached.

I'm worried that after an agent failure the content will be resent or never sent to the collector. If the agent exits suddenly, how can I check which line of the log file it has already sent? The files in the Flume log dir, such as sending and sent, are not readable.

Please give me some advice on handling this situation.

Thanks.

Andy Zhou
-
Re: How to know which line the agent has sent? Alexander Alten-Lorenz 2013-02-06, 07:50
You have no control in such situations, since tailDir uses tail and holds the marker in memory.

We had a thread about this a few days ago:
http://search-hadoop.com/m/JV0lh2RDXLX/flume+tail+source+problem+and+performance&subj=flume+tail+source+problem+and+performance

- Alex

--
Alexander Alten-Lorenz
http://mapredit.blogspot.com
German Hadoop LinkedIn Group: http://goo.gl/N8pCF
-
Re: How to know which line the agent has sent? 周梦想 2013-02-07, 03:03
I see. There is no easy way, and no configuration option, to find out exactly what has been sent and what hasn't. I will have to write my own source or sink code to do this.

Thank you, Alex and all friends.

Andy
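As a rough illustration of what such a custom source would need to track, here is a minimal sketch in Python that persists the read position to disk instead of holding it in memory. It is not Flume code: the class name `CheckpointedTail`, the checkpoint file format, and the `send` callback are all hypothetical, and log rotation is not handled.

```python
import json
import os
import tempfile


class CheckpointedTail:
    """Ship complete lines from a log file and persist the byte offset on disk.

    Hypothetical helper for illustration only; it is not part of Flume.
    """

    def __init__(self, log_path, checkpoint_path):
        self.log_path = log_path
        self.checkpoint_path = checkpoint_path

    def load_offset(self):
        # A missing or unreadable checkpoint means "start from the beginning".
        try:
            with open(self.checkpoint_path, "r", encoding="utf-8") as f:
                return json.load(f)["offset"]
        except (FileNotFoundError, ValueError, KeyError):
            return 0

    def save_offset(self, offset):
        # Write to a temporary file, then rename it over the checkpoint so a
        # crash never leaves a half-written checkpoint behind.
        directory = os.path.dirname(os.path.abspath(self.checkpoint_path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump({"offset": offset}, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.checkpoint_path)

    def ship_new_lines(self, send):
        """Call send(line) for each complete new line; return how many were sent."""
        offset = self.load_offset()
        shipped = 0
        with open(self.log_path, "rb") as f:
            f.seek(offset)
            while True:
                line = f.readline()
                if not line.endswith(b"\n"):
                    break  # end of file, or a partial line still being written
                send(line.rstrip(b"\n").decode("utf-8", errors="replace"))
                offset += len(line)
                self.save_offset(offset)  # checkpoint only after a successful send
                shipped += 1
        return shipped
```

Because the offset is saved after `send` returns, a crash between the two steps resends at most one line on restart. The checkpoint file itself answers the original question: it records how many bytes of the log have been handed off.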
-
Re: How to know which line the agent has sent? 周梦想 2013-02-07, 03:24
So do Flume users simply not care if the agent breaks down and fails to send, or duplicates, log content? Does everyone have to write their own sources and sinks? Don't they care about the correctness of their logs? What do they do when the Flume agent exits? I still don't understand.

Andy
-
Re: How to know which line the agent has sent? Jeong-shik Jang 2013-02-07, 04:18
I am not sure there is a simple and perfect solution for both loss and duplication on failure, with Flume or any other tool. For example, with Flume-OG, using E2E reliability mode you can minimize loss, but duplication can happen; using BE mode with startFromEnd=true for tail, you can minimize duplication, but loss can happen.

At the moment, we use a combination of our own plug-ins to minimize the effect of a failure and a monitoring/alert system to respond quickly.

-JS

--
Jeong-shik Jang / [EMAIL PROTECTED]
Gruter, Inc., R&D Team Leader
www.gruter.com
Enjoy Connecting
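The trade-off described above can be sketched in a few lines of Python. This is a toy model, not Flume's E2E implementation: `DedupingReceiver`, `send_at_least_once`, and the `lose_ack` callback are hypothetical names, and the event ID is assumed to be something stable such as (file name, byte offset). A sender that retries until it sees an acknowledgement avoids loss but can deliver the same event twice; a receiver that remembers event IDs can drop those repeats.

```python
class DedupingReceiver:
    """Stores each event once, keyed by a sender-assigned event ID (hypothetical)."""

    def __init__(self):
        self.seen = set()      # grows without bound in this sketch
        self.stored = []
        self.duplicates = 0

    def receive(self, event_id, body):
        if event_id in self.seen:
            self.duplicates += 1   # a retry of something already stored
        else:
            self.seen.add(event_id)
            self.stored.append(body)
        return True  # acknowledgement


def send_at_least_once(events, receiver, lose_ack, max_attempts=5):
    """Resend each (event_id, body) until an acknowledgement gets through.

    lose_ack(event_id, attempt) simulates the network dropping the ack, which
    is what makes the sender retry an event the receiver already has.
    """
    for event_id, body in events:
        for attempt in range(1, max_attempts + 1):
            acked = receiver.receive(event_id, body)
            if acked and not lose_ack(event_id, attempt):
                break
        else:
            raise RuntimeError(f"gave up on event {event_id!r}")


receiver = DedupingReceiver()
events = [(("app.log", 0), "a"), (("app.log", 2), "b")]
# Drop the first acknowledgement for the first event only.
send_at_least_once(
    events, receiver,
    lose_ack=lambda eid, attempt: eid == ("app.log", 0) and attempt == 1,
)
print(receiver.stored, receiver.duplicates)
```

The design choice here is to accept duplicates on the wire and remove them at the destination, which only works if every event carries an ID that stays the same across retries.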
-
Re: How to know which line the agent has sent? 周梦想 2013-02-07, 08:33
OK, thank you very much, JS.

Andy
-
Re: How to know which line the agent has sent? hoo.smth 2013-02-16, 09:00
This question has troubled me for a long time, too. I need to transport very important data via Flume, and no loss or duplication is allowed. Is flume-ng suitable for this situation?

Thank you very much.