Chukwa, mail # user - the check point offset is bigger than the log file size


Re: the check point offset is bigger than the log file size
IvyTang 2012-05-16, 07:15
Hi Ari,
   Thanks for your reply.

   We have indeed run into some problems using CharFileTailingAdaptorUTF8.

   Here is the relevant part of the tailFile method in FileTailingAdaptor:

        RandomAccessFile newReader = new RandomAccessFile(toWatch, "r");

        len = reader.length();
        long newLength = newReader.length();
        if (newLength < len && fileReadOffset >= len) {
          if (reader != null) {
            reader.close();
          }

          reader = newReader;
          fileReadOffset = 0L;
          log.debug("Adaptor|" + adaptorID + "| File size mismatched, rotating: "
              + toWatch.getAbsolutePath());

   When the file tailing adaptor finds that the log file has rotated, reader is
assigned to newReader. Does this mean that any log data in the old file that
had not yet been sent is lost?
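
One way to reason about this question is to model the quoted condition on its own. The sketch below is only an illustration, not Chukwa code: the names RotationCheckSketch, shouldSwitchReader and unreadBytes are made up for this example. Assuming len is the length seen through the old reader, the switch in the quoted branch only happens once fileReadOffset has reached len, so whatever the old handle could see has already been read by that point.

```java
// Illustrative sketch only; these names are hypothetical, not Chukwa APIs.
final class RotationCheckSketch {

  /** Mirrors the quoted condition: newLength < len && fileReadOffset >= len. */
  static boolean shouldSwitchReader(long len, long newLength, long fileReadOffset) {
    return newLength < len && fileReadOffset >= len;
  }

  /** Bytes the old reader could still see but that have not been read yet. */
  static long unreadBytes(long len, long fileReadOffset) {
    return Math.max(0L, len - fileReadOffset);
  }

  public static void main(String[] args) {
    // Old file fully read and new file shorter: the reader is switched.
    System.out.println(shouldSwitchReader(1000L, 10L, 1000L)); // true
    System.out.println(unreadBytes(1000L, 1000L));             // 0
    // Old file only partly read: this branch does not switch yet.
    System.out.println(shouldSwitchReader(1000L, 10L, 400L));  // false
    System.out.println(unreadBytes(1000L, 400L));              // 600
  }
}
```

Under these assumptions the branch itself does not skip unread bytes. Whether data can still be lost depends on the code paths that are not shown in the snippet.
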
On Wed, May 16, 2012 at 2:51 PM, Ariel Rabkin <[EMAIL PROTECTED]> wrote:

> Rotation is a bit of a mess.
>
> We've tried a couple strategies to handle it, none of which are perfect.
> One approach is to have a modified logger that explicitly invokes
> chukwa, starting and stopping adaptors.
> The other is that the FileTailingAdaptors keep not only a physical
> "how long is the file" offset, but a logical "what is the byte number
> of the first byte of the file" -- the idea is that if the file
> rotates, the adaptor should add the length of the rotated-out section
> to the length of the current file.
>
> This is a bit fragile, since the adaptor has to guess which was the
> previously-rotated file. I believe we use timestamps for that. I
> suspect it won't always work.
>
> --Ari
>
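
The logical-offset idea Ari describes can be sketched roughly as follows. This is a minimal illustration under assumptions, not the FileTailingAdaptor implementation: the class LogicalOffsetSketch and its fields and methods are invented for the example, and it takes the length of the rotated-out file as given, whereas the real adaptor has to guess which file that was.

```java
// Minimal sketch of a physical offset plus a logical base; names are hypothetical.
final class LogicalOffsetSketch {
  // Logical byte number of the first byte of the current file.
  private long offsetOfFirstByteInFile = 0L;
  // Physical read position inside the current file.
  private long fileReadOffset = 0L;

  /** Record that bytesRead more bytes were read from the current file. */
  void advance(long bytesRead) {
    fileReadOffset += bytesRead;
  }

  /** On rotation, fold the rotated-out file's length into the logical base. */
  void onRotation(long rotatedOutLength) {
    offsetOfFirstByteInFile += rotatedOutLength;
    fileReadOffset = 0L;
  }

  /** Offset that keeps growing across rotations. */
  long logicalOffset() {
    return offsetOfFirstByteInFile + fileReadOffset;
  }

  public static void main(String[] args) {
    LogicalOffsetSketch tail = new LogicalOffsetSketch();
    tail.advance(100L);     // read 100 bytes of the first file
    tail.onRotation(100L);  // first file (100 bytes long) rotated out
    tail.advance(30L);      // read 30 bytes of the new file
    System.out.println(tail.logicalOffset()); // 130
  }
}
```

With an offset like this, a commit reported after rotation is still larger than one reported before it, which is the property the checkpoint comparison relies on.
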
> On Tue, May 15, 2012 at 11:45 PM, IvyTang <[EMAIL PROTECTED]> wrote:
> >     After reading the source code, I'm confused about the checkpoint file.
> >
> >     The file tailer puts the chunks it generates into the MemLimitQueue, and
> > the HTTP sender takes chunks from the MemLimitQueue to send. After the HTTP
> > sender has reliably sent the chunks to the collector,
> > reportCommit(Adaptor src, long uuid) is called.
> >
> >    In reportCommit(Adaptor src, long uuid), src is the adaptor and uuid is
> > the offset in the file of the chunks that have been sent. If uuid >
> > adaptor.offset, that means more chunks have been sent, so adaptor.offset is
> > set to uuid.
> >
> >   This works fine when the log file is not rotating.
> >
> >     But if the log file is rotated (I mean the way log4j does it: move the
> > file to *.1 and create a new file named *), adaptor.offset is still the
> > offset of the chunks sent from the previous file, which is of course very
> > large, while uuid is the offset of the chunks sent from the current file,
> > so uuid is smaller than adaptor.offset.
> >
> >     So the checkpoint file doesn't change.
> >
> >     Chukwa keeps sending chunks to the collector, but if Chukwa is
> > restarted, the checkpoint offset is larger than the log file size, so the
> > whole log file is sent again.
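
The sequence described above can be reproduced with a small model. This is a simplified sketch of the behaviour as described in this thread, not Chukwa's code: CheckpointSketch, reportCommit(long) and resumeOffset are hypothetical names, and the numbers are the sizes reported earlier in the thread, used here only as sample values.

```java
// Simplified model of the behaviour described in the thread; names are hypothetical.
final class CheckpointSketch {
  // Plays the role of adaptor.offset in the description.
  private long offset;

  CheckpointSketch(long initialOffset) {
    this.offset = initialOffset;
  }

  /** Only moves the checkpoint forward: update when uuid > offset. */
  void reportCommit(long uuid) {
    if (uuid > offset) {
      offset = uuid;
    }
  }

  long offset() {
    return offset;
  }

  /** On restart, a checkpoint past the end of the file means: start again from 0. */
  static long resumeOffset(long checkpointOffset, long fileLength) {
    return checkpointOffset > fileLength ? 0L : checkpointOffset;
  }

  public static void main(String[] args) {
    CheckpointSketch adaptor = new CheckpointSketch(229406124L); // left over from the old file
    adaptor.reportCommit(4096L); // commits from the new, smaller file are ignored
    adaptor.reportCommit(8192L);
    System.out.println(adaptor.offset());                           // 229406124
    System.out.println(resumeOffset(adaptor.offset(), 158023223L)); // 0
  }
}
```

In this model the stale offset survives every commit from the new file, and the restart check then resets to 0, which matches the full re-send reported in the thread.
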
> >
> >
> >
> > On Mon, May 14, 2012 at 7:01 PM, IvyTang <[EMAIL PROTECTED]> wrote:
> >>
> >> The gamelog size is 158023223, but the checkpoint file contains
> >>
> >> ADD adaptor_2963225a90653a309cf779d4a1d815a3
> >> org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8
> >> Gamelog 0 /var/log/dataproxy/gamelog 229406124
> >>
> >> The gamelog didn't rotate, I'm sure.
> >>
> >> But the offset in the checkpoint file is bigger than the file size, so
> >> Chukwa logged
> >> WARN Thread-2 FileTailingAdaptor -
> >> Adaptor|adaptor_2963225a90653a309cf779d4a1d815a3| file:
> >> /var/log/dataproxy/gamelog, has rotated and no detection - reset counters to 0L
> >> and the agent began to transfer the whole log file.
> >>
> >> I'm confused about why the agent generated an offset bigger than the
> >> log size when the gamelog did not rotate.
> >>
> >> The chukwa version is 0.4.0
> >>
> >> --
> >> Best regards,
> >>
> >> Ivy Tang
> >>
> >>
> >>
> >
> >

Best regards,

Ivy Tang