java.lang.OutOfMemoryError: Direct buffer memory on HDFS sink
Hi guys,
My topology is like this:
I have set up 2 Flume nodes, each going from an Avro source to an HDFS sink:

StormAgent.sources = avro
StormAgent.channels = MemChannel
StormAgent.sinks = HDFS

StormAgent.sources.avro.type = avro
StormAgent.sources.avro.channels = MemChannel
StormAgent.sources.avro.bind = ip
StormAgent.sources.avro.port = 41414

StormAgent.sinks.HDFS.channel = MemChannel
StormAgent.sinks.HDFS.type = hdfs
StormAgent.sinks.HDFS.hdfs.path = maprfs:///hive/mytable/partition=%{partition}
StormAgent.sinks.HDFS.hdfs.fileType = SequenceFile
StormAgent.sinks.HDFS.hdfs.batchSize = 10000
StormAgent.sinks.HDFS.hdfs.rollSize = 15000000
StormAgent.sinks.HDFS.hdfs.rollCount = 0
StormAgent.sinks.HDFS.hdfs.rollInterval = 360

StormAgent.channels.MemChannel.type = memory
StormAgent.channels.MemChannel.capacity = 100000
StormAgent.channels.MemChannel.transactionCapacity = 100000

As you can see, the channel capacity is pretty big. I assigned 2g of memory to Flume when starting flume-ng.
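
For reference, the 2g goes in as a heap flag through conf/flume-env.sh, roughly like this (a sketch, my exact flags may differ):

JAVA_OPTS="-Xmx2048m"

(As far as I understand, -Xmx only caps the heap; the direct buffer pool is capped separately by -XX:MaxDirectMemorySize.)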

Then in my Storm topology I read the streaming data and call the Flume load-balancing RPC client to write to the Avro sources. The related config looks like this:

flume-avro-forward.client.type=default_loadbalance
flume-avro-forward.host-selector=random
flume-avro-forward.hosts=h1 h2
#dev0
flume-avro-forward.hosts.h1=ip1:41414
#dev1
flume-avro-forward.hosts.h2=ip2:41414
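
Roughly, the bolt side looks like the sketch below. The class and method names are just for illustration, and I'm showing the properties without the flume-avro-forward. prefix, which is how the Flume SDK's RpcClientFactory expects them:

import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.api.RpcClient;
import org.apache.flume.api.RpcClientFactory;
import org.apache.flume.event.EventBuilder;

public class FlumeAvroForwarder {

    private RpcClient client;

    // Build the load-balancing client from the same properties as above.
    public void open() {
        Properties props = new Properties();
        props.setProperty("client.type", "default_loadbalance");
        props.setProperty("host-selector", "random");
        props.setProperty("hosts", "h1 h2");
        props.setProperty("hosts.h1", "ip1:41414");
        props.setProperty("hosts.h2", "ip2:41414");
        client = RpcClientFactory.getInstance(props);
    }

    // Wrap one record in a Flume event; the "partition" header is what
    // %{partition} in the HDFS sink path resolves to.
    public void send(String record, String partition) throws EventDeliveryException {
        Map<String, String> headers = new HashMap<String, String>();
        headers.put("partition", partition);
        Event event = EventBuilder.withBody(record.getBytes(Charset.forName("UTF-8")), headers);
        client.append(event);
    }

    // Release the client and its underlying Netty resources.
    public void close() {
        if (client != null) {
            client.close();
        }
    }
}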

Everything works fine. However, after running for some time, I get this OutOfMemoryError in my Storm worker:
java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:632)
at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:97)
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:288)
at org.jboss.netty.channel.socket.nio.SocketSendBufferPool$Preallocation.<init>(SocketSendBufferPool.java:151)
at org.jboss.netty.channel.socket.nio.SocketSendBufferPool.<init>(SocketSendBufferPool.java:38)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.<init>(AbstractNioWorker.java:115)
at org.jboss.netty.channel.socket.nio.NioWorker.<init>(NioWorker.java:47)
at org.jboss.netty.channel.socket.nio.NioWorkerPool.createWorker(NioWorkerPool.java:34)
at org.jboss.netty.channel.socket.nio.NioWorkerPool.createWorker(NioWorkerPool.java:26)
at org.jboss.netty.channel.socket.nio.AbstractNioWorkerPool.<init>(AbstractNioWorkerPool.java:57)
at org.jboss.netty.channel.socket.nio.NioWorkerPool.<init>(NioWorkerPool.java:29)
at org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory.<init>(NioClientSocketChannelFactory.java:148)
at org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory.<init>(NioClientSocketChannelFactory.java:113)
at org.apache.flume.api.NettyAvroRpcClient.connect(NettyAvroRpcClient.java:140)
The Flume log also shows an exception when this happens.

The streaming data is big: each partition covers 6 minutes and holds around 20 MB of data, which is why I set rollSize to 15000000 and rollInterval to 6*60 s, to avoid generating lots of small files under each partition. I could use some help/guidance on tuning this config. I tried lowering transactionCapacity from 100000 to 50000, but I still get the exception. I also tried adding a third Flume node, and then it takes longer before I see the exception. What should I try?

Thanks in advance.
Chen

 