Re: DataXceiver error processing WRITE_BLOCK operation src: /x.x.x.x:50373 dest: /x.x.x.x:50010
This property was already set:
<property>
   <name>dfs.datanode.max.xcievers</name>
   <value>4096</value>
   <final>true</final>
</property>

Should I increase it more?
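
To check whether that limit is actually being approached, I can count the
live DataXceiver threads on the datanode and compare the count against the
configured maximum (a rough sketch: it assumes the JDK's jps/jstack tools
are on the PATH, that only one DataNode JVM runs on the host, and that the
xceiver threads carry "DataXceiver" in their thread names; in newer Hadoop
releases the same setting is spelled dfs.datanode.max.transfer.threads):

]$ DN_PID=$(jps | awk '/DataNode/ {print $1}')
]$ jstack "$DN_PID" | grep -c 'DataXceiver'

If that count stays far below 4096, the xceiver limit is probably not the
bottleneck.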

The same error keeps happening every 5-8 minutes on datanode 172.17.2.18.

2013-03-10 15:26:42,818 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode:
PSLBHDN002:50010:DataXceiver error processing READ_BLOCK operation  src:
/172.17.2.18:46422 dest: /172.17.2.18:50010
java.net.SocketTimeoutException: 480000 millis timeout while waiting for
channel to be ready for write. ch :
java.nio.channels.SocketChannel[connected local=/172.17.2.18:50010
remote=/172.17.2.18:46422]
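
Note: the 480000 millis in that message matches the default datanode socket
write timeout (dfs.datanode.socket.write.timeout, 8 minutes), i.e. the
remote side stopped reading for 8 minutes before the datanode gave up,
which points more at a stuck or overloaded peer than at the xceiver limit.
A hedged way to confirm the value in effect (assumes the Hadoop 2.x hdfs
CLI; on older releases the key can be checked in hdfs-site.xml instead):

]$ hdfs getconf -confKey dfs.datanode.socket.write.timeout
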
]$ lsof | wc -l
2393

]$ lsof | grep hbase | wc -l
4

]$ lsof | grep hdfs | wc -l
322

]$ lsof | grep hadoop | wc -l
162

]$ cat /proc/sys/fs/file-nr
4416    0    7327615

]$ date
Sun Mar 10 15:31:47 BRT 2013
What could be causing this? How can I extract more info about the error?
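
One thing I will try in the meantime is watching the per-state socket
counts on the data-transfer port, since a pile-up of CLOSE_WAIT or
half-open connections often accompanies these timeouts (a sketch; assumes
the default data-transfer port 50010 seen in the logs above):

]$ netstat -an | grep ':50010' | awk '{print $6}' | sort | uniq -c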

Thanks,
Pablo
On 03/08/2013 09:57 PM, Abdelrahman Shettia wrote:
> Hi,
>
> If the open-files limits (for the hbase and hdfs users) are already
> set to more than 30K, please change dfs.datanode.max.xcievers to
> more than the value below.
>
> <property>
>    <name>dfs.datanode.max.xcievers</name>
>    <value>2096</value>
>    <description>PRIVATE CONFIG VARIABLE</description>
> </property>
>
> Try increasing this one and tune it to the HBase usage.
>
>
> Thanks
>
> -Abdelrahman
>
> On Fri, Mar 8, 2013 at 9:28 AM, Pablo Musa <[EMAIL PROTECTED]> wrote:
>
>     I am also having this issue and tried a lot of solutions, but
>     could not solve it.
>
>     ]# ulimit -n    ** same result running as root and as hdfs (the datanode user)
>     32768
>
>     ]# cat /proc/sys/fs/file-nr
>     2080    0    8047008
>
>     ]# lsof | wc -l
>     5157
>
>     Sometimes this issue even happens from one node to itself :(
>
>     I also think this issue is messing with my regionservers, which are
>     crashing all day long!!
>
>     Thanks,
>     Pablo
>
>
>     On 03/08/2013 06:42 AM, Dhanasekaran Anbalagan wrote:
>>     Hi Varun
>>
>>     I believe it is not a ulimit issue.
>>
>>
>>     /etc/security/limits.conf
>>     # End of file
>>     *               -      nofile  1000000
>>     *               -      nproc 1000000
>>
>>
>>     Please guide me, guys. I want to fix this; please share your
>>     thoughts on this DataXceiver error.
>>
>>     Did I learn something today? If not, I wasted it.
>>
>>
>>     On Fri, Mar 8, 2013 at 3:50 AM, varun kumar <[EMAIL PROTECTED]> wrote:
>>
>>         Hi Dhana,
>>
>>         Increase the ulimit for all the datanodes.
>>
>>         If you are starting the service as the hadoop user, increase
>>         the ulimit value for the hadoop user.
>>
>>         Make the changes in the following file.
>>
>>         */etc/security/limits.conf*
>>
>>         Example:-
>>         *hadoop          soft    nofile    35000*
>>         *hadoop          hard    nofile    35000*
>>
>>         Regards,
>>         Varun Kumar.P
>>
>>         On Fri, Mar 8, 2013 at 1:15 PM, Dhanasekaran Anbalagan
>>         <[EMAIL PROTECTED]> wrote:
>>
>>             Hi Guys
>>
>>             I am frequently getting this error on my datanodes.
>>
>>             Please help me figure out what exactly the problem is.
>>
>>             dvcliftonhera138:50010:DataXceiver error processing WRITE_BLOCK operation src: /172.16.30.138:50373 dest: /172.16.30.138:50010
>>
>>             java.net.SocketTimeoutException: 70000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.30.138:34280 remote=/172.16.30.140:50010]
>>
>>             at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>>             at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:154)
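
Following Varun's limits.conf suggestion above, one way to verify that a
running datanode actually picked up a new nofile limit is its /proc entry
(a sketch; note that limits.conf only applies to sessions started after
the change, so the datanode has to be restarted first):

]$ DN_PID=$(jps | awk '/DataNode/ {print $1}')
]$ grep 'open files' /proc/"$DN_PID"/limits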