Flume, mail # user - java.lang.OutOfMemoryError: Direct buffer memory on HDFS sink - 2014-01-31, 07:58
java.lang.OutOfMemoryError: Direct buffer memory on HDFS sink
Hi Guys,
My topology is like this:
I have set up 2 Flume nodes, each going from an avro source to an HDFS sink:

StormAgent.sources = avro
StormAgent.channels = MemChannel
StormAgent.sinks = HDFS

StormAgent.sources.avro.type = avro
StormAgent.sources.avro.channels = MemChannel
StormAgent.sources.avro.bind = ip
StormAgent.sources.avro.port = 41414

StormAgent.sinks.HDFS.channel = MemChannel
StormAgent.sinks.HDFS.type = hdfs
StormAgent.sinks.HDFS.hdfs.path =
StormAgent.sinks.HDFS.hdfs.fileType = SequenceFile
StormAgent.sinks.HDFS.hdfs.batchSize = 10000
StormAgent.sinks.HDFS.hdfs.rollSize = 15000000
StormAgent.sinks.HDFS.hdfs.rollCount = 0
StormAgent.sinks.HDFS.hdfs.rollInterval = 360

StormAgent.channels.MemChannel.type = memory
StormAgent.channels.MemChannel.capacity = 100000
StormAgent.channels.MemChannel.transactionCapacity = 100000

As you can see, the channel capacity is pretty big. I gave Flume a 2 GB heap
when starting flume-ng.
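One detail worth noting: the 2 GB heap setting does not govern Netty's direct (off-heap) buffers, which are what the error below is about. Direct memory is capped separately by the JVM. A minimal sketch of how this could be set explicitly, assuming a flume-env.sh-style setup (the 512m value is an illustrative assumption, not a recommendation):

```shell
# Hypothetical flume-env.sh fragment: the heap (-Xmx) and the direct-buffer
# cap (-XX:MaxDirectMemorySize) are independent limits.
export JAVA_OPTS="-Xms2g -Xmx2g -XX:MaxDirectMemorySize=512m"
```

The same flag would apply to the Storm worker JVM, since the OutOfMemoryError below is thrown on the client (worker) side.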

Then in my Storm topology, I read streaming data and call the Flume
load-balancing RPC client to write to the avro sources. The related config looks like:

flume-avro-forward.hosts=h1 h2

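For reference, a load-balancing RpcClient is usually configured with a properties set along the lines below (following the standard Flume client properties format; my actual config uses a connector-specific prefix, and the host:port values here are placeholders, not my real ones):

```properties
# Hypothetical load-balancing RPC client properties (placeholders throughout)
client.type = default_loadbalance
hosts = h1 h2
hosts.h1 = flume-node-1:41414
hosts.h2 = flume-node-2:41414
host-selector = round_robin
backoff = true
maxBackoff = 10000
```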
Everything works fine. However, after running for some time, I receive the
following out-of-memory error in my Storm worker:

java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:632)
at java.nio.DirectByteBuffer.&lt;init&gt;(DirectByteBuffer.java:97)
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288)
at org.jboss.netty.channel.socket.nio.SocketSendBufferPool$Preallocation.&lt;init&gt;(SocketSendBufferPool.java:151)
at org.jboss.netty.channel.socket.nio.SocketSendBufferPool.&lt;init&gt;(SocketSendBufferPool.java:38)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.&lt;init&gt;(AbstractNioWorker.java:115)
at org.jboss.netty.channel.socket.nio.NioWorker.&lt;init&gt;(NioWorker.java:47)
at org.jboss.netty.channel.socket.nio.NioWorkerPool.createWorker(NioWorkerPool.java:34)
at org.jboss.netty.channel.socket.nio.NioWorkerPool.createWorker(NioWorkerPool.java:26)
at org.jboss.netty.channel.socket.nio.AbstractNioWorkerPool.&lt;init&gt;(AbstractNioWorkerPool.java:57)
at org.jboss.netty.channel.socket.nio.NioWorkerPool.&lt;init&gt;(NioWorkerPool.java:29)
at org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory.&lt;init&gt;(NioClientSocketChannelFactory.java:148)
at org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory.&lt;init&gt;(NioClientSocketChannelFactory.java:113)
at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:140)
The Flume logs also show exceptions.

The streaming data is large: each 6-minute window is one partition holding
roughly 20 MB of data, which is why I set rollSize to 15000000 and
rollInterval to 6*60 s, to avoid generating small files under each partition.
I could use some help/guidance on tuning this config. I tried lowering
transactionCapacity from 100000 to 50000, but still hit the exception. I also
tried adding a third Flume node; that only delays the exception. What should
I try?
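For context, the stack trace shows each NettyAvroRpcClient.connect() building a new NioClientSocketChannelFactory, and each factory preallocates its own direct send buffers. So one suspicion is client lifecycle: if a worker keeps recreating clients on failure without closing the old ones, direct memory fills up regardless of heap size. A minimal sketch of the reuse-and-close pattern on the Storm side, using Flume's public RpcClientFactory API (the wrapper class itself is hypothetical, not my actual bolt code):

```java
import java.nio.charset.StandardCharsets;
import java.util.Properties;

import org.apache.flume.Event;
import org.apache.flume.api.RpcClient;
import org.apache.flume.api.RpcClientFactory;
import org.apache.flume.event.EventBuilder;

// Hypothetical wrapper: keep one RpcClient per worker and close a dead
// client before replacing it, so its Netty direct buffers are released.
public class FlumeSender {
    private final Properties props;
    private RpcClient client;

    public FlumeSender(Properties props) {
        this.props = props;
    }

    public synchronized void send(String body) throws Exception {
        if (client == null || !client.isActive()) {
            if (client != null) {
                client.close(); // free the old factory's direct buffers
            }
            client = RpcClientFactory.getInstance(props);
        }
        Event event = EventBuilder.withBody(body.getBytes(StandardCharsets.UTF_8));
        client.append(event);
    }

    public synchronized void shutdown() {
        if (client != null) {
            client.close();
            client = null;
        }
    }
}
```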

Thanks in advance.
