JVM error while collecting from hour dividing log with flume-ng


larryzhang 2013-03-11, 10:49
Re: JVM error while collecting from hour dividing log with flume-ng
You are using a known-bad JVM version. I would upgrade:
http://wiki.apache.org/hadoop/HadoopJavaVersions
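As a quick check, something like the following helper (illustrative only, not part of Flume), compiled and run with the same java binary that the flume-ng script launches, prints the exact JVM build in use so it can be compared against the list above; note that the crash report quoted below identifies the running JRE as 6.0_18-b07 rather than the 1.6.0_27-ea mentioned in the mail.

// Illustrative helper only: print the version details of the JVM running it.
// Run it with the same java binary used to launch the Flume agent and compare
// the output against the HadoopJavaVersions page linked above.
public class ShowJvmVersion {
    public static void main(String[] args) {
        System.out.println("java.version    = " + System.getProperty("java.version"));
        System.out.println("java.vm.version = " + System.getProperty("java.vm.version"));
        System.out.println("java.vm.name    = " + System.getProperty("java.vm.name"));
    }
}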
On Mon, Mar 11, 2013 at 5:49 AM, larryzhang <[EMAIL PROTECTED]> wrote:

>
>  Hi,
>     I want to collect and analyse user logs every 5 minutes. We have an
> origin log file that is generated by nginx and rotated hourly, about
> 30,000,000 log lines per hour. The log format is like this:
>            60.222.199.118 - - [11/Mar/2013:16:00:00 +0800] "GET ....
>     Because I want to first collect the logs into a file, I wrote a
> FileEventSink, with just some modifications based on
> org.apache.flume.sink.hdfs.BucketWriter.java and
> org.apache.flume.sink.hdfs.HDFSEventSink.java. Following is my flume config
> file:
> =================> a1.sources = r1
> a1.channels = c1
> a1.sinks = k1
>
> a1.sources.r1.type = cn.larry.flume.source.MyExecSource      //I need to
> put the time and other info into event headers, so I added this logic based
> on ExecSource (sketched after the config below)
> a1.sources.r1.command = tail -n +0 -F /data2/log/log_2013031117.log
> a1.sources.r1.channels = c1
> a1.sources.r1.batchSize = 1         //I set this to 1 because otherwise it
> loses data at the end of the log file; I applied this patch
> https://issues.apache.org/jira/browse/FLUME-1819 but it seems to be no help...
>
> a1.channels.c1.type = memory
> a1.channels.c1.capacity = 1000000
> a1.channels.c1.transactionCapacity = 10000
>
> a1.sinks.k1.type = cn.larry.flume.sink.FileEventSink
> a1.sinks.k1.channel = c1
> a1.sinks.k1.file.path = /opt/livedata/%Y%m%d/%H
> a1.sinks.k1.file.filePrefix = log-%Y%m%d%H%M
> a1.sinks.k1.file.round = true
> a1.sinks.k1.file.roundValue = 5
> a1.sinks.k1.file.roundUnit = minute
> a1.sinks.k1.file.rollInterval=300
> a1.sinks.k1.file.rollSize=0
> a1.sinks.k1.file.rollCount=0
> a1.sinks.k1.file.batchSize=100
>
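For context on the MyExecSource comment in the source section above: Flume's %Y%m%d / %H style escapes in a sink path are resolved from a "timestamp" event header, so a custom ExecSource variant would typically stamp each event before it reaches the channel. A minimal sketch of that idea, assuming the standard Flume NG event API; the actual MyExecSource code is not shown in this thread, and it presumably parses the time out of the nginx log line rather than using the wall clock as this sketch does.

import java.util.HashMap;
import java.util.Map;

import org.apache.flume.Event;
import org.apache.flume.event.EventBuilder;

// Sketch: attach a "timestamp" header (milliseconds) so bucketing escapes
// such as %Y%m%d and %H in file.path can be resolved by the sink.
public class TimestampedEvents {
    public static Event fromLogLine(String line) {
        Map<String, String> headers = new HashMap<String, String>();
        // Bucketing escapes read the event time from this header.
        headers.put("timestamp", Long.toString(System.currentTimeMillis()));
        return EventBuilder.withBody(line.getBytes(), headers);
    }
}
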
>    And because I need to change the source log file name each hour, I
> wrote a script which does the following:
>      1. At the 1st minute of each hour:
>          ->copy a new config file, which just changes the source log file
> name (a1.sources.r1.command = tail -n +0 -F /data2/log/log_<new time>.log)
>          ->start a new flume process which uses the new config. (I did this
> because if a flume process dies, it won't affect the next hour)
>      2. At the 30th minute of each hour:
>          -> kill the flume process of the previous hour.
>   This project has been running for more than 10 days; most of the time it
> works well, but sometimes the flume process crashes due to a JVM error, about
> once every 2 days! I used JVM version 1.6.0_27-ea. Here's the log of the
> error which happened on 2013-03-11:
>
> 2013-03-11 05:45:11,302 (file-k1-roll-timer-0) [INFO -
> cn.larry.flume.sink.FileBucketWriter.renameBucket(FileBucketWriter.java:408)]
> Renaming /opt/livedata/20130311/05/log-201303110540.1362951611255.tmp to
> /opt/livedata/20130311/05/log-201303110540.1362951611255
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x00002b4247be034e, pid=1463, tid=1098979648
> #
> # JRE version: 6.0_18-b07
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (16.0-b13 mixed mode
> linux-amd64 )
> # Problematic frame:
> # V  [libjvm.so+0x2de34e]
> #
> # An error report file with more information is saved as:
> # /opt/scripts/tvhadoop/flume/flume-1.3.0/bin/hs_err_pid1463.log
> #
> # If you would like to submit a bug report, please visit:
> #   http://java.sun.com/webapps/bugreport/crash.jsp
> #
> + exec /usr/local/jdk/bin/java -Xmx2048m -Dflume.root.logger=INFO,console
> -cp
> '/opt/tvhadoop/apache-flume-1.3.1-bin/conf:/opt/tvhadoop/apache-flume-1.3.1-bin/lib/*'
> -Djava.library.path= org.apache.flume.node.Application -f
> /opt/tvhadoop/apache-flume-1.3.1-bin/conf/flume_2013031106.conf -n a1
>
>    And the JVM dump file is in the attachments.
>    I wonder how to handle this problem.
>    And another question is about the ExecSource: I don't know why it loses
> data if batchSize > 1. If I use a file channel, I must use a large batchSize
> to achieve the required throughput...
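On the batchSize question at the end: the usual way a source avoids stranding the tail of its input when batching is to flush a partial batch after a timeout as well as when it reaches batchSize. A rough sketch of that pattern, using Flume NG's ChannelProcessor API; this is illustrative only, not Flume's actual ExecSource code or the FLUME-1819 patch.

import java.util.ArrayList;
import java.util.List;

import org.apache.flume.Event;
import org.apache.flume.channel.ChannelProcessor;

// Sketch: buffer events for throughput, but flush a partial batch after a
// timeout so lines at the end of the input are not left sitting in memory.
public class BatchingForwarder {
    private final ChannelProcessor channelProcessor;
    private final int batchSize;
    private final long batchTimeoutMs;

    private final List<Event> buffer = new ArrayList<Event>();
    private long lastFlushMs = System.currentTimeMillis();

    public BatchingForwarder(ChannelProcessor channelProcessor,
                             int batchSize, long batchTimeoutMs) {
        this.channelProcessor = channelProcessor;
        this.batchSize = batchSize;
        this.batchTimeoutMs = batchTimeoutMs;
    }

    public synchronized void add(Event event) {
        buffer.add(event);
        flushIfNeeded(false);
    }

    // Call periodically from a timer thread so a partial batch still gets
    // delivered when no new input arrives (e.g. at the end of the file).
    public synchronized void flushIfNeeded(boolean force) {
        boolean timedOut = System.currentTimeMillis() - lastFlushMs >= batchTimeoutMs;
        if (!buffer.isEmpty() && (force || timedOut || buffer.size() >= batchSize)) {
            channelProcessor.processEventBatch(new ArrayList<Event>(buffer));
            buffer.clear();
            lastFlushMs = System.currentTimeMillis();
        }
    }
}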
Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/
larryzhang 2013-03-13, 01:36