Re: Change the FileTailingAdaptor tailFile(),let it apply to the log4j rotated log files
Ahmed Fathalla 2012-07-10, 08:36

Ivy,
Thanks a lot for your contributions! However, a better way to submit patches to Chukwa is:

1- Open a JIRA issue on https://issues.apache.org/jira/browse/CHUKWA
   I think you will need to register on JIRA first in order to create issues.
2- Generate a .patch file using Eclipse or command-line tools. The .patch file will be a single file with all the changes you have made.
3- Using the "Submit Patch" option on the JIRA issue, upload the file.

Your changes will be reviewed by a committer and committed if everything is okay.

Thank you again for your interest in Chukwa. We really appreciate your proactive approach in contributing back to the project!

On Tue, Jul 10, 2012 at 10:04 AM, IvyTang <[EMAIL PROTECTED]> wrote:

> Our team has used the Chukwa CharFileTailingAdaptorUTF8 to collect
> log4j rotated log files for several months. It helps us collect the
> logs from everywhere into our Hadoop center.
> During this work we met several problems. I have raised them on this
> mailing list, but I still haven't got a good solution.
> So we read the source code and made some changes.
>
> Our log files are generated by log4j, and the log4j appender is
> org.apache.log4j.DailyRollingFileAppender.
> If you use log4j to generate rotated logs, this mail may help you.
>
> These two problems are the reasons why we had to modify the source code.
>
> 1. The mismatch between checkpoint size and file size.
>
> I raised this problem on May 14, "the check point offset is bigger
> than the log file size". Ariel Rabkin and Eric answered my
> question; thanks for your replies.
>
> When Chukwa starts, it reads the checkpoint file and uses the
> size stored there as the fileReadOffset. The size in the checkpoint indicates how many
> bytes the adaptor has sent.
>
> If the log source is a stream or a file that won't rotate, this size is
> right; it is indeed the fileReadOffset. But if the file is rotated, the
> checkpoint size is often bigger than the file size, and this will cause
> Chukwa to resend the whole log file.
>
> So we added a log.info("chunk seqID:"+c.getSeqID()); in
> ChukwaHttpSender:send.
>
>     for (Chunk c : toSend) {
>       DataOutputBuffer b = new DataOutputBuffer(c.getSerializedSizeEstimate());
>       try {
>         c.write(b);
>       } catch (IOException err) {
>         log.error("serialization threw IOException", err);
>       }
>       serializedEvents.add(b);
>       // store a CLE for this chunk which we will use to ack this chunk to the
>       // caller of send()
>       // (e.g. the agent will use the list of CLE's for checkpointing)
>       log.info("chunk seqID:"+c.getSeqID());
>       commitResults.add(new CommitListEntry(c.getInitiator(), c.getSeqID(),
>           c.getSeqID() - c.getData().length));
>     }
>
> The seqID is the offset of the sent chunks in this log file.
> So when we need to restart Chukwa, we just stop Chukwa, change the
> size in the checkpoint to the last chunk seqID in the log, and
> start Chukwa.
> We could also apply the seqID to the checkpoint size directly, but we
> don't know whether this would cause other problems.
>
> 2. The method tailFile in FileTailingAdaptor is the core code for
> collecting the log. The code uses fileReadOffset and the file length to detect
> the rotated file.
>
>     RandomAccessFile newReader = new RandomAccessFile(toWatch, "r");
>     len = reader.length();
>     long newLength = newReader.length();
>     if (newLength < len && fileReadOffset >= len) {
>       if (reader != null) {
>         reader.close();
>       }
>
>       reader = newReader;
>       fileReadOffset = 0L;
>       log.debug("Adaptor|"+ adaptorID + "| File size mismatched, rotating: "
>           + toWatch.getAbsolutePath());
>     } else {
>       try {
>         if (newReader != null) {
>           newReader.close();
>         }

Ahmed Fathalla
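
The rotation test quoted from tailFile() can be pulled out into a pure function, which makes its cases easier to check by hand. What follows is only a minimal sketch, not Chukwa code: the class name TailOffsets and both method names are hypothetical, and the second method merely flags the "checkpoint offset bigger than the file" situation from problem 1 without deciding what the agent should do about it.

```java
// Hypothetical helper for illustration only; nothing here is part of Chukwa.
class TailOffsets {

    // Same condition as the quoted tailFile() excerpt:
    // the file now on disk is shorter than the one the open reader sees,
    // and everything in the old file has already been read.
    static boolean looksRotated(long openLength, long currentLength, long readOffset) {
        return currentLength < openLength && readOffset >= openLength;
    }

    // Assumption: a checkpointed offset past the end of the current file
    // cannot be a valid position in that file (problem 1 in the mail).
    static boolean checkpointBeyondFile(long checkpointOffset, long fileLength) {
        return checkpointOffset > fileLength;
    }

    public static void main(String[] args) {
        // Old file fully read (4096 bytes), new file has only 100 bytes.
        System.out.println(looksRotated(4096L, 100L, 4096L));
        // File simply grew; no rotation.
        System.out.println(looksRotated(4096L, 5000L, 4096L));
        // Checkpoint says 4096 bytes sent, but the file holds 100 bytes.
        System.out.println(checkpointBeyondFile(4096L, 100L));
    }
}
```

Note that when the agent has just restarted, a freshly opened reader and the file on disk report the same length, so the first check cannot fire on its own; that is the case where the checkpointed offset has to be compared with the file length separately.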