Re: Change the FileTailingAdaptor tailFile(), let it apply to the log4j rotated log files
IvyTang 2012-07-10, 09:53
Thanks for your reply!
I will do this in a JIRA issue.

On Tue, Jul 10, 2012 at 4:36 PM, Ahmed Fathalla <[EMAIL PROTECTED]> wrote:

> Ivy,
>
> Thanks a lot for your contributions!
>
> However, a better way to submit patches to Chukwa is:
>
> 1- Open a JIRA issue on
>
> https://issues.apache.org/jira/browse/CHUKWA
>
> I think you will need to register on JIRA first in order to create issues.
>
> 2- Generate a .patch file using Eclipse or command-line tools. The .patch
> file will be a single file with all the changes you have made.
>
> 3- Using the "Submit Patch" option on the JIRA issue, upload the file. Your
> changes will be reviewed by a committer and committed if everything is okay.
>
> Thank you again for your interest in Chukwa. We really appreciate your
> proactive approach in contributing back to the project!
>
> On Tue, Jul 10, 2012 at 10:04 AM, IvyTang <[EMAIL PROTECTED]> wrote:
>
>> Our team has used Chukwa's *CharFileTailingAdaptorUTF8* to collect
>> log4j rotated log files for several months. It helps us collect
>> logs from everywhere into our Hadoop center.
>> During this work we met several problems. I have raised them on
>> this mailing list, but I still haven't got a good solution.
>> So we read the source code and made some changes.
>>
>> Our log files are generated by log4j, and the log4j appender is
>> org.apache.log4j.DailyRollingFileAppender.
>> If you use log4j to generate rotated logs, this mail may help you.
>>
>> These two problems are the reasons we had to modify the source code.
>>
>> 1. The mismatch between checkpoint size and file size.
>>
>> I raised this problem on May 14: "the check point offset is bigger
>> than the log file size". Ariel Rabkin and Eric answered my
>> question; thanks for your replies.
>>
>> When Chukwa starts, it reads the checkpoint file and uses the size
>> stored there as the fileReadOffset. The size in the checkpoint indicates
>> how many bytes the adaptor has sent.
>>
>> If the log source is a stream or a file that won't rotate, this size is
>> right; it is indeed the fileReadOffset. But if the file is rotated, the
>> checkpoint size is often bigger than the file size, and this causes
>> Chukwa to resend the whole log file.
>>
>> So we added a log.info("chunk seqID:"+c.getSeqID()); call in
>> ChukwaHttpSender:send.
>>
>>     for (Chunk c : toSend) {
>>       DataOutputBuffer b = new DataOutputBuffer(c.getSerializedSizeEstimate());
>>       try {
>>         c.write(b);
>>       } catch (IOException err) {
>>         log.error("serialization threw IOException", err);
>>       }
>>       serializedEvents.add(b);
>>       // store a CLE for this chunk which we will use to ack this chunk to the
>>       // caller of send()
>>       // (e.g. the agent will use the list of CLE's for checkpointing)
>>       log.info("chunk seqID:"+c.getSeqID());
>>       commitResults.add(new CommitListEntry(c.getInitiator(), c.getSeqID(),
>>           c.getSeqID() - c.getData().length));
>>     }
>>
>> *The seqID is the offset of the sent chunks in this log file.*
>> So when we need to restart Chukwa, we just stop Chukwa, change the
>> size in the checkpoint to the last chunk seqID in the log, and
>> start Chukwa.
>> We could also apply the seqID directly to the checkpoint size, but we
>> don't know if this would cause other problems.
>>
>> 2. The method tailFile in FileTailingAdaptor is the core code for
>> collecting the log. The code uses the fileReadOffset and the file length
>> to detect the rotated file.
>>
>>     RandomAccessFile newReader = new RandomAccessFile(toWatch, "r");
>>
>>     len = reader.length();
>>     long newLength = newReader.length();
>>     if (newLength < len && fileReadOffset >= len) {
>>       if (reader != null) {
>>         reader.close();
>>       }
>>
>>       reader = newReader;
>>       fileReadOffset = 0L;
>>       log.debug("Adaptor|"+ adaptorID + "| File size mismatched,

Best regards,
Ivy Tang
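
For readers following point 2 of the quoted message, the rotation check from tailFile() can be pulled out into a small standalone sketch. This is only an illustration of the condition as quoted above, not Chukwa's actual code: the class name RotationCheck and the method looksRotated are hypothetical, and the lengths are passed in as plain numbers instead of being read from RandomAccessFile objects.

```java
// Hypothetical helper, not part of Chukwa. It isolates the condition quoted
// from FileTailingAdaptor.tailFile() so its behaviour can be tried out alone.
final class RotationCheck {
    private RotationCheck() {}

    /**
     * Returns true when the file on disk looks like a fresh, rotated file:
     * it is shorter than the file we still hold open, and we have already
     * read the open file up to (or past) its end.
     *
     * @param openLength     length of the file behind the current reader
     * @param newLength      length of the file currently at the watched path
     * @param fileReadOffset number of bytes already read by the adaptor
     */
    public static boolean looksRotated(long openLength, long newLength, long fileReadOffset) {
        return newLength < openLength && fileReadOffset >= openLength;
    }

    public static void main(String[] args) {
        // Old file fully read, new file is shorter: treated as rotated.
        System.out.println(looksRotated(1000L, 200L, 1000L)); // true
        // Old file not yet fully read: keep reading the old file first.
        System.out.println(looksRotated(1000L, 200L, 500L));  // false
        // File only grew: no rotation detected.
        System.out.println(looksRotated(1000L, 1500L, 1000L)); // false
    }
}
```

Note that the condition only compares lengths, so a rotated file that has already grown past the old file's length would not be detected by this check alone.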