RE: Datanode error
I am sorry, but I received an error when I sent the message to the list, and all the responses went to my junk mail folder.
So I sent it again, and only then noticed your emails.

Sorry!!

-----Original Message-----
From: Harsh J [mailto:[EMAIL PROTECTED]]
Sent: Monday, July 23, 2012 11:07
To: [EMAIL PROTECTED]
Subject: Re: Datanode error

Pablo,

Perhaps you've forgotten about it, but you asked the same question last week and you did get some responses to it. Please see your earlier thread at http://search-hadoop.com/m/0BOOh17ugmD

On Mon, Jul 23, 2012 at 7:27 PM, Pablo Musa <[EMAIL PROTECTED]> wrote:
> Hey guys,
> I have a cluster with 11 nodes (1 NN and 10 DNs) which is running and working.
> However my datanodes keep having the same errors, over and over.
>
> I googled the problems and tried different flags (e.g.
> -XX:MaxDirectMemorySize=2G) and different configs (xceivers=8192) but could not solve it.
>
> Does anyone know what the problem is and how I can solve it? (The
> stack trace is at the end.)
>
> I am running:
> Java 1.7
> Hadoop 0.20.2
> HBase 0.90.6
> ZooKeeper 3.3.5
>
> % top -> shows low load average (6% most of the time, up to 60%), already considering the number of CPUs
> % vmstat -> shows no swap at all
> % sar -> shows 75% idle CPU in the worst case
>
> Hope you guys can help me.
> Thanks in advance,
> Pablo
>
> 2012-07-20 00:03:44,455 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /DN01:50010, dest: /DN01:43516, bytes: 396288, op: HDFS_READ, cliID: DFSClient_hb_rs_DN01,60020,1342734302945_1342734303427, offset: 54956544, srvID: DS-798921853-DN01-50010-1328651609047, blockid: blk_914960691839012728_14061688, duration: 480061254006
> 2012-07-20 00:03:44,455 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(DN01:50010, storageID=DS-798921853-DN01-50010-1328651609047, infoPort=50075, ipcPort=50020):Got exception while serving blk_914960691839012728_14061688 to /DN01:
> java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/DN01:50010 remote=/DN01:43516]
>         at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
>         at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
>         at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:397)
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:493)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:279)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:175)
>
> 2012-07-20 00:03:44,455 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(DN01:50010, storageID=DS-798921853-DN01-50010-1328651609047, infoPort=50075, ipcPort=50020):DataXceiver
> java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/DN01:50010 remote=/DN01:43516]
>         at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:246)
>         at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
>         at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:397)
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:493)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:279)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:175)
>
> 2012-07-20 00:12:11,949 INFO
> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification

Harsh J
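
A note on reading the quoted log: the clienttrace entry and the SocketTimeoutException appear to describe the same event. The sketch below pulls the two numbers out of the excerpt and puts them in the same unit. It assumes the clienttrace "duration" field is in nanoseconds (the log itself does not state the unit); the function names and the shortened log strings are only for illustration.

```python
import re

# Values copied from the log excerpt quoted in this thread (lines shortened).
CLIENTTRACE = ("op: HDFS_READ, blockid: blk_914960691839012728_14061688, "
               "duration: 480061254006")
EXCEPTION = ("java.net.SocketTimeoutException: 480000 millis timeout while "
             "waiting for channel to be ready for write.")


def duration_seconds(line):
    """Return the clienttrace duration in seconds.

    Assumption: the field is logged in nanoseconds.
    """
    match = re.search(r"duration:\s*(\d+)", line)
    return int(match.group(1)) / 1e9 if match else None


def timeout_seconds(line):
    """Return the timeout from a SocketTimeoutException message, in seconds."""
    match = re.search(r"(\d+) millis timeout", line)
    return int(match.group(1)) / 1000 if match else None


d = duration_seconds(CLIENTTRACE)
t = timeout_seconds(EXCEPTION)
print(f"read lasted {d:.2f} s, write timeout is {t:.0f} s")
```

Under that assumption, the HDFS_READ ran for just over the 480000 ms timeout before the datanode gave up on it. That would be consistent with a reader that stopped consuming data from the socket, rather than with an overloaded datanode.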