multioutput dfs.datanode.max.xcievers and too many open files
Hey there,
I've been running a cluster of about 20 machines for about a year. I've run
many concurrent jobs on it, some of them using MultipleOutputs, and never had
any problem (those jobs created just 3 or 4 different outputs).
Now I have a job whose MultipleOutputs creates 100 different outputs, and it
always ends up with errors.
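For context, the write pattern is just the usual MultipleOutputs one, with the
output file derived from the key. This is only a simplified sketch (new
mapreduce API shown; the class name, field names and key-to-path mapping are
made up, not the actual job code):

    import java.io.IOException;

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

    public class CategoryReducer extends Reducer<Text, Text, Text, Text> {

        private MultipleOutputs<Text, Text> mos;

        @Override
        protected void setup(Context context) {
            mos = new MultipleOutputs<Text, Text>(context);
        }

        @Override
        protected void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            for (Text value : values) {
                // The base output path is derived from the record, so a single
                // task can end up with ~100 distinct files (and HDFS output
                // streams) open at the same time.
                mos.write(key, value, key.toString() + "/part");
            }
        }

        @Override
        protected void cleanup(Context context) throws IOException, InterruptedException {
            mos.close();  // flushes and closes all per-output writers
        }
    }

So with ~100 distinct output names per task, each task keeps ~100 HDFS output
streams open until cleanup().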
Tasks start throwing these errors:

java.io.IOException: Bad connect ack with firstBadLink 10.2.0.154:50010
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2963)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2888)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1900(DFSClient.java:2139)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2329)
or:
java.io.EOFException
        at java.io.DataInputStream.readByte(DataInputStream.java:250)
        at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
        at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319)
        at org.apache.hadoop.io.Text.readString(Text.java:400)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2961)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2888)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1900(DFSClient.java:2139)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2329)
Checking the datanode log, I see this error hundreds of times:
2012-02-23 14:22:56,008 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Reopen already-open Block for append blk_3368446040000470452_29464903
2012-02-23 14:22:56,008 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_3368446040000470452_29464903 received exception java.net.SocketException: Too many open files
2012-02-23 14:22:56,008 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.2.0.156:50010, storageID=DS-1194175480-10.2.0.156-50010-1329304363220, infoPort=50075, ipcPort=50020):DataXceiver
java.net.SocketException: Too many open files
        at sun.nio.ch.Net.socket0(Native Method)
        at sun.nio.ch.Net.socket(Net.java:97)
        at sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:84)
        at sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:37)
        at java.nio.channels.SocketChannel.open(SocketChannel.java:105)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.newSocket(DataNode.java:429)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:296)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:118)
2012-02-23 14:22:56,034 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_-2698946892792040969_29464904 src: /10.2.0.156:40969 dest: /10.2.0.156:50010
2012-02-23 14:22:56,035 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_-2698946892792040969_29464904 received exception java.net.SocketException: Too many open files
2012-02-23 14:22:56,035 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.2.0.156:50010, storageID=DS-1194175480-10.2.0.156-50010-1329304363220, infoPort=50075, ipcPort=50020):DataXceiver
java.net.SocketException: Too many open files
        at sun.nio.ch.Net.socket0(Native Method)
        at sun.nio.ch.Net.socket(Net.java:97)
        at sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:84)
        at sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:37)
        at java.nio.channels.SocketChannel.open(SocketChannel.java:105)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.newSocket(DataNode.java:429)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:296)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:118)
I've always had this configured in hdfs-site.xml:
        <property>
                <name>dfs.datanode.max.xcievers</name>
                <value>4096</value>
        </property>

But I think it's no longer enough to handle that many MultipleOutputs. If I
increase max.xcievers even further, what are the side effects? What value
should be considered the maximum? (I suppose it depends on CPU and RAM, but
roughly.)
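For what it's worth, my rough back-of-envelope (and I may well be counting
wrong): with, say, 8 concurrent reduce tasks per node on the 20 nodes, that's
~160 tasks, each holding ~100 open output streams, so ~16,000 write pipelines.
Assuming the default replication of 3, each pipeline ties up an xceiver thread
(plus a socket and block/meta file descriptors) on 3 datanodes, i.e. roughly
16,000 * 3 / 20 = about 2,400 xceivers per datanode, and several times that
many open file descriptors -- which would already blow past a 1024 (or even
4096) fd ulimit before reaching max.xcievers = 4096.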

Thanks in advance.
