Flume >> mail # dev >> [jira] [Commented] (FLUME-1326) OutOfMemoryError in HDFSSink


[jira] [Commented] (FLUME-1326) OutOfMemoryError in HDFSSink

    [ https://issues.apache.org/jira/browse/FLUME-1326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13893482#comment-13893482 ]

dave sinclair commented on FLUME-1326:
--------------------------------------

Sorry, I've been meaning to post this patch. Let me know what you think, Hari.

Thanks

> OutOfMemoryError in HDFSSink
> ----------------------------
>
>                 Key: FLUME-1326
>                 URL: https://issues.apache.org/jira/browse/FLUME-1326
>             Project: Flume
>          Issue Type: Bug
>    Affects Versions: v1.2.0, v1.3.0, v1.4.0
>            Reporter: Juhani Connolly
>            Priority: Critical
>              Labels: hdfssink, memory_leak
>         Attachments: FLUME-1326.patch
>
>
> We run a 3-node/1-collector test cluster pushing about 350 events/sec per node. Not really high stress, just something to evaluate Flume with.
> Our collector has consistently been dying because of an OutOfMemoryError killing the SinkRunner after running for about 30-40 hours (this seems pretty consistent, as we've had it 3 times now).
> The suspected cause is a memory leak somewhere in HdfsSink. The feeder nodes, which run AvroSink instead of HdfsSink, have been up and running for about a week without restarts.
> flume-load/act-wap02/2012-06-26-17.1340697637324.tmp, packetSize=65557, chunksPerPacket=127, bytesCurBlock=29731328
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> 2012-06-26 17:12:56,080 (SinkRunner-PollingRunner-DefaultSinkProcessor) [ERROR - org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:411)] process failed
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>         at java.util.Arrays.copyOfRange(Arrays.java:3209)
>         at java.lang.String.<init>(String.java:215)
>         at java.lang.StringBuilder.toString(StringBuilder.java:430)
>         at org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:306)
>         at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:367)
>         at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>         at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>         at java.lang.Thread.run(Thread.java:619)
> Exception in thread "SinkRunner-PollingRunner-DefaultSinkProcessor" java.lang.OutOfMemoryError: GC overhead limit exceeded
>         at java.util.Arrays.copyOfRange(Arrays.java:3209)
>         at java.lang.String.<init>(String.java:215)
>         at java.lang.StringBuilder.toString(StringBuilder.java:430)
>         at org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:306)
>         at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:367)
>         at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>         at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>         at java.lang.Thread.run(Thread.java:619)

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)