Chukwa, mail # dev - Creating a new adaptor: FileTailingAdaptor that would not cut lines


Re: Creating a new adaptor: FileTailingAdaptor that would not cut lines
Luangsay Sourygna 2013-04-24, 04:49
Sure, we can statically increase maxReadSize in the configuration. But the
fact is that we have to handle two different situations:
- when a file is growing rapidly and we want a quick response for the other
files: this means we don't want maxReadSize to be too big (I guess this
was the initial idea behind this parameter).
- when a line in a file is much bigger than the other lines and its size
can exceed the initial maxReadSize value: this means we would like
a very high maxReadSize.

Since maxReadSize can't be small and large at the same time, I propose a
"dynamic" value for this parameter.
Usually, this parameter should be small (128 kB for instance), and when a
very big line appears (when we have bufferRead == MAX_READ_SIZE AND
bytesUsed == 0), we should temporarily increase its value. Then, once the
big line has been sent, we go back to the initial value.
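
To make the idea concrete, here is a minimal sketch of such a policy in Java. It is not the actual LWFTAdaptor code: the class DynamicReadSize, its methods, the doubling step, and the hard upper limit are all assumptions made for illustration. It also assumes that bytesUsed == 0 means "no complete line was found in the buffer". The "file" is just a byte array so the example can run on its own.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Chukwa: tracks a read size that grows
// while a single line does not fit and shrinks back once it has been sent.
class DynamicReadSize {
    private final int initialReadSize; // the usual, small value (e.g. 128 kB)
    private final int hardLimit;       // assumed safety cap so memory stays bounded
    private int currentReadSize;

    DynamicReadSize(int initialReadSize, int hardLimit) {
        this.initialReadSize = initialReadSize;
        this.hardLimit = hardLimit;
        this.currentReadSize = initialReadSize;
    }

    int currentReadSize() {
        return currentReadSize;
    }

    // Call after each read. Returns true if the read size was increased.
    boolean afterRead(int bytesRead, int bytesUsed) {
        if (bytesRead == currentReadSize && bytesUsed == 0 && currentReadSize < hardLimit) {
            // Buffer was full and held no complete line: grow temporarily.
            currentReadSize = (int) Math.min((long) currentReadSize * 2, hardLimit);
            return true;
        }
        if (bytesUsed > 0) {
            // Something was sent: go back to the initial value.
            currentReadSize = initialReadSize;
        }
        return false;
    }

    // Simulated tailing loop over an in-memory "file". Each emitted record
    // ends at the last '\n' found in the bytes read, so no line is cut.
    static List<String> tail(byte[] file, DynamicReadSize policy) {
        List<String> records = new ArrayList<>();
        int offset = 0;
        while (offset < file.length) {
            int bytesRead = Math.min(policy.currentReadSize(), file.length - offset);
            int lastNewline = -1;
            for (int i = offset + bytesRead - 1; i >= offset; i--) {
                if (file[i] == '\n') {
                    lastNewline = i;
                    break;
                }
            }
            int bytesUsed = lastNewline < 0 ? 0 : lastNewline - offset + 1;
            if (bytesUsed > 0) {
                records.add(new String(file, offset, bytesUsed, StandardCharsets.UTF_8));
                offset += bytesUsed;
            }
            boolean grew = policy.afterRead(bytesRead, bytesUsed);
            if (bytesUsed == 0 && !grew) {
                // Partial line at end of file, or the hard limit was reached:
                // stop and wait for more data instead of looping forever.
                break;
            }
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] file = "short\nthis line is longer than the initial read size\nend\n"
                .getBytes(StandardCharsets.UTF_8);
        DynamicReadSize policy = new DynamicReadSize(8, 1024);
        for (String record : tail(file, policy)) {
            System.out.print("record: " + record);
        }
        System.out.println("read size afterwards: " + policy.currentReadSize());
    }
}
```

The hard limit is there only so that a file with no line feed at all cannot make the buffer grow without bound; what to do when it is reached (warn, drop, or cut the line) is left open in this sketch.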

Makes sense?

Regards,

Sourygna

On Mon, Apr 22, 2013 at 6:25 AM, Eric Yang <[EMAIL PROTECTED]> wrote:

> maxReadSize can be increased in the configuration.  If using larger
> maxReadSize is preferred, we can update the default to be larger size.
>
> regards,
> Eric
>
> On Sun, Apr 21, 2013 at 3:07 PM, Luangsay Sourygna <[EMAIL PROTECTED]> wrote:
>
> > As I said before, I don't think Chukwa should handle those situations,
> > since I think this is a "log rotation" problem.
> > Personally, I have never seen such a problem (log4j RFA, for instance, has a
> > kind of "flexible" size and every rotated file ends with a \n).
> >
> > On the other hand, there is a special situation I think Chukwa should
> > take care of.
> > The default value of the configuration property
> > "chukwaAgent.fileTailingAdaptor.maxReadSize" is 128kB, which means that
> > if a line/record is bigger than that size, the record won't be sent by
> > the agent.
> > We'll get a warning in Chukwa's log, but the record will be lost (see
> > the LWFTAdaptor.slurp() method).
> > In such a case, would it be possible to temporarily increase MAX_READ_SIZE
> > so that we are able to send one record on the wire?
> >
> > Regards,
> >
> > Sourygna
> >
> >
> >
> >
> > On Sun, Apr 21, 2013 at 7:05 PM, Eric Yang <[EMAIL PROTECTED]> wrote:
> >
> > > Do we need to consider rotation based on size?  For example, take the
> > > last line of a log file that reaches 300MB.  There is no line break in
> > > the first file, but the entry continues in the next rotated log and
> > > then has a line feed delimiter.  If we are splitting lines based on \n,
> > > then we can reconstruct the full line between the two files. I am not
> > > sure whether this case needs to be supported.
> > >
> > > regards,
> > > Eric
> > >
> > >
> > > On Fri, Apr 19, 2013 at 12:01 PM, Luangsay Sourygna <[EMAIL PROTECTED]> wrote:
> > >
> > > > Well, the log4j socket adaptor may be great if you control the
> > > > software that generates the logs.
> > > > That is not usually my case: customers don't really like having to
> > > > install a Chukwa agent on their production servers, so I don't want
> > > > to think about telling them to change the logging system of their
> > > > software.
> > > >
> > > > As for partial lines when log files rotate, I don't think this is
> > > > something Chukwa should manage (what is more: how could Chukwa be
> > > > aware there is a problem?).
> > > > In my view, this would be an error of the "logrotate" system. As far
> > > > as I know, the RFA and DRFA log4j appenders handle rotation quite
> > > > well.
> > > >
> > > > Regards,
> > > >
> > > > Sourygna
> > > >
> > > >
> > > > On Fri, Apr 19, 2013 at 8:17 AM, Eric Yang <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > I think the best solution is to use the Log4j socket appender and
> > > > > the Chukwa log4j socket adaptor to get the full log entry without
> > > > > worrying about line feeds.  However, this solution only works with
> > > > > programs written in Java, and does not keep a copy of the existing
> > > > > log file on disk.
> > > >