Re: DataXceiver error processing WRITE_BLOCK operation src: /x.x.x.x:50373 dest: /x.x.x.x:50010
This variable was already set:
<property>
   <name>dfs.datanode.max.xcievers</name>
   <value>4096</value>
   <final>true</final>
</property>

Should I increase it more?
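
One way to sanity-check that before raising it further (a sketch; assumes the JDK's jps and jstack are on the PATH, and <datanode-pid> is a placeholder): count the DataNode's live DataXceiver threads and compare against the configured 4096.

]$ jps | grep DataNode
]$ jstack <datanode-pid> | grep -c DataXceiver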

The same error is happening every 5-8 minutes on datanode 172.17.2.18.

2013-03-10 15:26:42,818 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode:
PSLBHDN002:50010:DataXceiver error processing READ_BLOCK operation  src:
/172.17.2.18:46422 dest: /172.17.2.18:50010
java.net.SocketTimeoutException: 480000 millis timeout while waiting for
channel to be ready for write. ch :
java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010
remote=/172.17.2.18:46422]
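
As a note on the trace itself: the 480000 ms there matches the DataNode's default socket write timeout (8 minutes). If slow readers are expected, the timeout can be raised in hdfs-site.xml; a sketch with an illustrative value, not a tested recommendation:

<property>
   <name>dfs.datanode.socket.write.timeout</name>
   <value>960000</value>
</property>
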
]$ lsof | wc -l
2393

]$ lsof | grep hbase | wc -l
4

]$ lsof | grep hdfs | wc -l
322

]$ lsof | grep hadoop | wc -l
162

]$ cat /proc/sys/fs/file-nr
4416    0    7327615

]$ date
Sun Mar 10 15:31:47 BRT 2013
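
For reference, the three file-nr fields are the allocated, free-but-allocated, and maximum file handles system-wide, so the numbers above look far from the limit. A per-user view may be more telling; a sketch, assuming the datanode runs as the hdfs user with a login shell:

]$ lsof -u hdfs | wc -l
]$ su - hdfs -c 'ulimit -n'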
What could be the causes? How can I extract more info about the error?

Thanks,
Pablo
On 03/08/2013 09:57 PM, Abdelrahman Shettia wrote:
> Hi,
>
> If the open-files limits for both the hbase and hdfs users are already
> set to more than 30K, please change dfs.datanode.max.xcievers to
> more than the value below.
>
> <property>
>    <name>dfs.datanode.max.xcievers</name>
>    <value>2096</value>
>    <description>PRIVATE CONFIG VARIABLE</description>
> </property>
>
> Try increasing this one and tuning it to your HBase usage.
>
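> As an aside (based on later Hadoop releases; worth verifying against your
> version): this property was renamed dfs.datanode.max.transfer.threads in
> Hadoop 2.x, with the misspelled dfs.datanode.max.xcievers kept as a
> deprecated alias, e.g.:
>
> <property>
>    <name>dfs.datanode.max.transfer.threads</name>
>    <value>8192</value>
> </property>
>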
> Thanks
>
> -Abdelrahman
>
> On Fri, Mar 8, 2013 at 9:28 AM, Pablo Musa <[EMAIL PROTECTED]> wrote:
>
>     I am also having this issue and tried a lot of solutions, but
>     could not solve it.
>
>     ]# ulimit -n    # running as root and as hdfs (the datanode user)
>     32768
>
>     ]# cat /proc/sys/fs/file-nr
>     2080    0    8047008
>
>     ]# lsof | wc -l
>     5157
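>
>     One more check that might help (a sketch; the pgrep pattern is an
>     assumption): the limits of the already-running daemon can differ from
>     what a fresh shell reports, so read them straight from /proc:
>
>     ]# cat /proc/$(pgrep -f DataNode)/limits | grep -i 'open files'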
>
>     Sometimes this issue happens from one node to the same node :(
>
>     I also think this issue is messing with my regionservers which are
>     crashing all day long!!
>
>     Thanks,
>     Pablo
>
>
>     On 03/08/2013 06:42 AM, Dhanasekaran Anbalagan wrote:
>>     Hi Varun
>>
>>     I believe it is not a ulimit issue.
>>
>>     /etc/security/limits.conf
>>     # End of file
>>     *               -      nofile  1000000
>>     *               -      nproc 1000000
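>>
>>     One caveat here (worth double-checking on your distro): the * wildcard
>>     in limits.conf does not apply to the root user, so a daemon running as
>>     root needs explicit lines, e.g.:
>>
>>     root            -      nofile  1000000
>>     root            -      nproc 1000000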
>>
>>     Please guide me, guys, I want to fix this. Please share your
>>     thoughts on this DataXceiver error.
>>
>>     Did I learn something today? If not, I wasted it.
>>
>>
>>     On Fri, Mar 8, 2013 at 3:50 AM, varun kumar <[EMAIL PROTECTED]> wrote:
>>
>>         Hi Dhana,
>>
>>         Increase the ulimit for all the datanodes.
>>
>>         If you are starting the service as the hadoop user, increase the
>>         ulimit value for the hadoop user.
>>
>>         Make the changes in the following file:
>>
>>         /etc/security/limits.conf
>>
>>         Example:
>>         hadoop          soft    nofile    35000
>>         hadoop          hard    nofile    35000
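>>
>>         Note that limits.conf changes apply only to new login sessions, so
>>         restart the datanode afterwards and confirm the limits it actually
>>         picked up (a sketch):
>>
>>         su - hadoop -c 'ulimit -Sn; ulimit -Hn'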
>>
>>         Regards,
>>         Varun Kumar.P
>>
>>         On Fri, Mar 8, 2013 at 1:15 PM, Dhanasekaran Anbalagan
>>         <[EMAIL PROTECTED]> wrote:
>>
>>             Hi Guys
>>
>>             I am frequently getting this error on my datanodes.
>>
>>             Please advise: what exactly is the problem here?
>>
>>             dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50373 dest: /172.16.30.138:50010
>>
>>             java.net.SocketTimeoutException: 70000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.30.138:34280 remote=/172.16.30.140:50010]
>>
>>             at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>>             at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:154)
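>>
>>             As far as I can tell, the 70000 ms figure is the 60-second default
>>             read timeout (dfs.socket.timeout) plus a 5-second extension per
>>             datanode in the pipeline; on a slow network the base timeout can
>>             be raised in hdfs-site.xml (illustrative value, an assumption):
>>
>>             <property>
>>                <name>dfs.socket.timeout</name>
>>                <value>120000</value>
>>             </property>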