Flume, mail # user - JVM error while collecting from hour dividing log with flume-ng


Re: JVM error while collecting from hour dividing log with flume-ng
larryzhang 2013-03-13, 01:36
Great. I updated the JVM to 1.6.0_31 yesterday, and it has worked
well since.   Thanks a lot.
On 03/12/2013 12:07 AM, Brock Noland wrote:
> You are using a known bad jvm version. I would upgrade:
> http://wiki.apache.org/hadoop/HadoopJavaVersions
>
>
> On Mon, Mar 11, 2013 at 5:49 AM, larryzhang <[EMAIL PROTECTED]> wrote:
>
>
>     Hi,
>         I want to collect and analyse user logs every 5 minutes. Our
>     original log files are generated by nginx and split by hour, at
>     about 30,000,000 log lines per hour. The log format looks like this:
>                60.222.199.118 - - [11/Mar/2013:16:00:00 +0800] "GET ....
>         Because I first want to collect the logs into files, I wrote a
>     FileEventSink, with some modifications based on
>     org.apache.flume.sink.hdfs.BucketWriter.java and
>     org.apache.flume.sink.hdfs.HDFSEventSink.java. Here is my
>     flume config file:
>     =================
>     a1.sources = r1
>     a1.channels = c1
>     a1.sinks = k1
>
>     // I need to put the time and other info into the headers, so I
>     // added that logic on top of ExecSource
>     a1.sources.r1.type = cn.larry.flume.source.MyExecSource
>     a1.sources.r1.command = tail -n +0 -F /data2/log/log_2013031117.log
>     a1.sources.r1.channels = c1
>     // I set batchSize to 1 because otherwise data is lost at the end
>     // of the log file. I applied the patch from
>     // https://issues.apache.org/jira/browse/FLUME-1819 but it did not
>     // seem to help.
>     a1.sources.r1.batchSize = 1
>
>     a1.channels.c1.type = memory
>     a1.channels.c1.capacity = 1000000
>     a1.channels.c1.transactionCapacity = 10000
>
>     a1.sinks.k1.type = cn.larry.flume.sink.FileEventSink
>     a1.sinks.k1.channel = c1
>     a1.sinks.k1.file.path = /opt/livedata/%Y%m%d/%H
>     a1.sinks.k1.file.filePrefix = log-%Y%m%d%H%M
>     a1.sinks.k1.file.round = true
>     a1.sinks.k1.file.roundValue = 5
>     a1.sinks.k1.file.roundUnit = minute
>     a1.sinks.k1.file.rollInterval=300
>     a1.sinks.k1.file.rollSize=0
>     a1.sinks.k1.file.rollCount=0
>     a1.sinks.k1.file.batchSize=100
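
For reference, the round / roundValue / roundUnit settings above appear intended to round each event timestamp down to a 5-minute boundary before the %Y%m%d%H%M escapes are expanded, which matches the log-201303110540 file name in the crash log further down. A minimal Python sketch of that rounding, assuming the custom sink mirrors the rounding of the HDFS sink it was derived from (round_down and bucket_path are hypothetical helpers, not part of Flume):

```python
from datetime import datetime


def round_down(ts: datetime, minutes: int = 5) -> datetime:
    """Round ts down to the nearest multiple of `minutes` within its hour."""
    return ts.replace(minute=ts.minute - ts.minute % minutes,
                      second=0, microsecond=0)


def bucket_path(ts: datetime) -> str:
    """Hypothetical helper: expand file.path and file.filePrefix from the config."""
    rounded = round_down(ts)
    return rounded.strftime("/opt/livedata/%Y%m%d/%H/log-%Y%m%d%H%M")


# An event logged at 05:43:27 lands in the 05:40 bucket.
print(bucket_path(datetime(2013, 3, 11, 5, 43, 27)))
```

With rollInterval=300, each such bucket would then be closed and renamed roughly five minutes after it was opened.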
>
>        Because I need to change the source log file name each hour, I
>     wrote a script which does 3 things:
>          1. At the 1st minute of each hour:
>              -> copy a new config file, which only changes the source
>     log file name (a1.sources.r1.command = tail -n +0 -F
>     /data2/log/log_<new time>.log)
>              -> start a new flume process which uses the new config. (I
>     did this so that if a flume process dies, it won't affect the next hour.)
>          2. At the 30th minute of each hour:
>              -> kill the flume process of the previous hour.
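
The per-hour config change described above comes down to rebuilding one line from the current hour. A rough Python sketch of that step, assuming the log_%Y%m%d%H.log naming seen in the config (tail_command and source_line are hypothetical helpers; the original script is not shown in the message):

```python
from datetime import datetime

LOG_DIR = "/data2/log"  # directory taken from the config in the message


def tail_command(hour: datetime) -> str:
    """Build the exec-source command for the log file of the given hour."""
    return f"tail -n +0 -F {LOG_DIR}/log_{hour:%Y%m%d%H}.log"


def source_line(hour: datetime) -> str:
    """Render the single config line that differs between hourly configs."""
    return f"a1.sources.r1.command = {tail_command(hour)}"


print(source_line(datetime(2013, 3, 11, 17)))
```

Starting the new agent and killing the previous one are left out here, since they depend on how the processes are tracked.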
>       This project has been running for more than 10 days. Most of the
>     time it works well, but sometimes the flume process crashes due to a
>     JVM error, about once every 2 days! I used jvm version 1.6.0_27-ea.
>     Here's the error log from 2013-03-11:
>
>     2013-03-11 05:45:11,302 (file-k1-roll-timer-0) [INFO -
>     cn.larry.flume.sink.FileBucketWriter.renameBucket(FileBucketWriter.java:408)]
>     Renaming /opt/livedata/20130311/05/log-201303110540.1362951611255.tmp to
>     /opt/livedata/20130311/05/log-201303110540.1362951611255
>     #
>     # A fatal error has been detected by the Java Runtime Environment:
>     #
>     #  SIGSEGV (0xb) at pc=0x00002b4247be034e, pid=1463, tid=1098979648
>     #
>     # JRE version: 6.0_18-b07
>     # Java VM: Java HotSpot(TM) 64-Bit Server VM (16.0-b13 mixed mode
>     linux-amd64 )
>     # Problematic frame:
>     # V  [libjvm.so+0x2de34e]
>     #
>     # An error report file with more information is saved as:
>     # /opt/scripts/tvhadoop/flume/flume-1.3.0/bin/hs_err_pid1463.log
>     #
>     # If you would like to submit a bug report, please visit:
>     # http://java.sun.com/webapps/bugreport/crash.jsp
>     #
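
The crash banner above reports the JRE that actually ran as 6.0_18-b07, which is not the 1.6.0_27-ea mentioned in the message, so it can be worth pulling that line out when checking which JVM a Flume process picked up. A small Python sketch, assuming the banner keeps the format shown above (jre_version is a hypothetical helper):

```python
import re

BANNER = """\
# JRE version: 6.0_18-b07
# Java VM: Java HotSpot(TM) 64-Bit Server VM (16.0-b13 mixed mode linux-amd64 )
"""

# Tolerates leading whitespace and mail quote markers before the '#'.
_JRE_LINE = re.compile(r"^[>\s]*#\s*JRE version:\s*(\d+)\.(\d+)_(\d+)-(\S+)",
                       re.MULTILINE)


def jre_version(text: str):
    """Return (major, minor, update, build) from a crash banner, or None."""
    match = _JRE_LINE.search(text)
    if match is None:
        return None
    major, minor, update, build = match.groups()
    return int(major), int(minor), int(update), build


print(jre_version(BANNER))
```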
>     + exec /usr/local/jdk/bin/java -Xmx2048m