After reading the source code, I'm confused about the checkpoint file.
The file tailer puts chunks into the MemLimitQueue, and the HTTP sender
takes chunks from the MemLimitQueue to send. Once the HTTP sender has
reliably delivered the chunks to the collector, reportCommit(Adaptor
src, long uuid) is called.
In reportCommit(Adaptor src, long uuid), src is the adaptor and uuid is
the offset in the file up to which chunks have been sent. If uuid >
adaptor.offset, more chunks have been sent since the last commit, so
adaptor.offset is set to uuid.
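To make the rule concrete, here is a minimal sketch of the commit logic as I understand it. It is a simplified model, not the actual Chukwa source: the class name, the map of offsets, and the String adaptor id are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the commit rule described above; NOT the real Chukwa code.
public class CommitSketch {
    // Hypothetical stand-in for the per-adaptor offsets written to the checkpoint file.
    public static final Map<String, Long> committedOffsets = new HashMap<>();

    // Only move the stored offset forward; smaller or equal values are ignored.
    public static void reportCommit(String adaptorId, long uuid) {
        long current = committedOffsets.getOrDefault(adaptorId, 0L);
        if (uuid > current) {
            committedOffsets.put(adaptorId, uuid);
        }
    }

    public static void main(String[] args) {
        reportCommit("gamelog", 1000L);
        reportCommit("gamelog", 500L);   // smaller than the stored offset, ignored
        reportCommit("gamelog", 2000L);
        System.out.println(committedOffsets.get("gamelog")); // prints 2000
    }
}
```

The point of the sketch is only that the stored offset can never decrease through this path.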
This works fine when the log file is not rotating.
But if the log file is rotated (I mean the log4j way: move the file to
*.1 and create a new file named *), adaptor.offset is still the offset
of the chunks sent from the previous file, which is of course very large,
while uuid is the offset of the chunks sent from the new file, so uuid is
smaller than adaptor.offset.
So the checkpoint file won't change.
Chukwa keeps sending chunks to the collector, but if Chukwa is
restarted, the checkpoint offset is larger than the log file size, so the
whole log file will be sent again.
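The scenario can be modeled in a few lines. Again this is only a sketch under my assumptions: the class, the field, and the resumeOffset helper are hypothetical, the restart rule is inferred from the "reset counters to 0L" warning quoted below, and the offsets are made-up numbers.

```java
// Simplified model of the rotation problem; NOT the real Chukwa code.
public class RotationSketch {
    // Hypothetical stand-in for the offset stored in the checkpoint file.
    long checkpointOffset = 0L;

    // Commit rule as I read it: only move the checkpoint forward.
    void reportCommit(long uuid) {
        if (uuid > checkpointOffset) {
            checkpointOffset = uuid;
        }
    }

    // Assumed restart rule: a checkpoint past the end of the file is treated
    // as an undetected rotation and the read position is reset to 0.
    static long resumeOffset(long checkpoint, long fileLength) {
        return checkpoint > fileLength ? 0L : checkpoint;
    }

    public static void main(String[] args) {
        RotationSketch agent = new RotationSketch();
        agent.reportCommit(5000L);   // 5000 bytes of the old file were sent

        // The log is rotated here: offsets in the new file start again from 0.
        agent.reportCommit(400L);    // ignored, 400 < 5000
        agent.reportCommit(1200L);   // ignored, 1200 < 5000

        System.out.println(agent.checkpointOffset);                      // prints 5000
        System.out.println(resumeOffset(agent.checkpointOffset, 1200L)); // prints 0
    }
}
```

With these numbers the checkpoint stays at 5000 although 1200 bytes of the new file were delivered, and after a restart the new file is read from offset 0 again.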
On Mon, May 14, 2012 at 7:01 PM, IvyTang <[EMAIL PROTECTED]> wrote:
> The gamelog size is 158023223, but the checkpoint file is:
> ADD adaptor_2963225a90653a309cf779d4a1d815a3
> org.apache.hadoop.chukwa.datacollection.adaptor.filetailer.CharFileTailingAdaptorUTF8
> Gamelog 0 /var/log/dataproxy/gamelog 229406124
> The gamelog didn't rotate, I'm sure.
> But the offset in the checkpoint file is bigger than the file size, and Chukwa logged:
> WARN Thread-2 FileTailingAdaptor -
> Adaptor|adaptor_2963225a90653a309cf779d4a1d815a3| file:
> /var/log/dataproxy/gamelog, has rotated and no detection - reset counters
> to 0L
> And the agent began to transfer the whole log file.
> I'm just confused about why the agent generates an offset bigger than the
> log size when the gamelog did not rotate.
> The chukwa version is 0.4.0
> Best regards,
> Ivy Tang