RE: Lots of Different Kind of Datanode Errors
HDFS-1148
   - Andy
From: Gokulakannan M <[EMAIL PROTECTED]>
Subject: RE: Lots of Different Kind of Datanode Errors
To: [EMAIL PROTECTED], [EMAIL PROTECTED]
Date: Monday, June 7, 2010, 10:31 PM
 
 

 

Hi Andy,

             

            What is the reference of that fix?

  

 Thanks,

  Gokul

 

  

From: Andrew Purtell [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, June 08, 2010 1:24 AM
To: [EMAIL PROTECTED]
Subject: Re: Lots of Different Kind of Datanode Errors

  
 
  
  The current synchronization on FSDataset seems not quite right.
  Applying what amounted to Todd's patch, which modifies FSDataset
  to use reentrant read-write locks, cleared up that type of problem
  for us.
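
  As a rough illustration of the locking style described above (this is
  not Todd's patch and not the real FSDataset), here is a minimal sketch
  of a block map guarded by java.util.concurrent's ReentrantReadWriteLock.
  The class BlockMapSketch, its methods, and the example paths are all
  hypothetical names made up for this sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a datanode's block map; NOT the real FSDataset.
class BlockMapSketch {
    private final Map<Long, String> blockFiles = new HashMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Lookups take the shared read lock, so many readers can proceed at once.
    String getBlockFile(long blockId) {
        lock.readLock().lock();
        try {
            return blockFiles.get(blockId);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Mutations take the exclusive write lock. The existence check and the
    // insert happen under the same lock, so two writers cannot both create
    // the same block. Returns false if the block was already present.
    boolean addBlock(long blockId, String path) {
        lock.writeLock().lock();
        try {
            if (blockFiles.containsKey(blockId)) {
                return false;
            }
            blockFiles.put(blockId, path);
            return true;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

  Compared with a single synchronized monitor, readers no longer
  serialize behind each other, while writers still get exclusive access.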
  
  
  
    
  
  
    - Andy
  

  From: Jeff Whiting <[EMAIL PROTECTED]>

  Subject: Re: Lots of Different Kind of Datanode Errors

  To: [EMAIL PROTECTED]

  Date: Monday, June 7, 2010, 10:02 AM
  
  Thanks for the replies.  I have turned off swap
  on all the machines to prevent any swap problems.  I was pounding my
  hard drives quite hard.  I had a simulated 60 clients loading data as
  fast as I could into hbase with a map reduce export job going at the same
  time.  Would that scenario explain some of the errors I was seeing?

  

  Over the weekend, under a more normal load, I haven't seen any exceptions
  except for about 6 of these:

  2010-06-05 03:46:41,229 ERROR datanode.DataNode (DataXceiver.java:run(131)) -
  DatanodeRegistration(192.168.0.98:50010,
  storageID=DS-1806250311-192.168.0.98-50010-1274208294562, infoPort=50075,
  ipcPort=50020):DataXceiver

  org.apache.hadoop.hdfs.server.datanode.BlockAlreadyExistsException: Block
  blk_-1677111232590888964_4471547 is valid, and cannot be written to.

      at org.apache.hadoop.hdfs.server.datanode.FSDataset.writeToBlock(FSDataset.java:999)

  

  The config shows 4096 because I increased the xceiver count
  after the first email message in this thread.

  

  ~Jeff

  

  Allen Wittenauer wrote:  
  On Jun 4, 2010, at 12:03 PM, Todd Lipcon wrote:     
  Hi Jeff,

  That seems like a reasonable config, but the error message you pasted indicated xceivers was set to 2048 instead of 4096.

  Also, in my experience, SocketTimeoutExceptions are usually due to swapping. Verify that your machines aren't swapping when you're under load.

  Or doing any other heavy disk IO.     
    
  --
  Jeff Whiting
  Qualtrics Senior Software [EMAIL PROTECTED]