Please check my recent experience with the data loss issue.
Performance degradation is 9% with 10 client threads on an 8-node cluster, 24TB per machine. It is always safe to set this flag if your machines will get a good amount of load.
From: Eric Hwang [[EMAIL PROTECTED]]
Sent: Friday, December 02, 2011 9:56 AM
To: Hairong Kuang
Cc: Zheng Shao; [EMAIL PROTECTED]
Subject: Re: another HDFS configuration for scribeH
What is the risk for this change? How much more testing do you think will be needed?
From: Hairong Kuang <[EMAIL PROTECTED]>
Date: Thu, 1 Dec 2011 20:22:10 -0800
To: Internal Use <[EMAIL PROTECTED]>
Cc: Zheng Shao <[EMAIL PROTECTED]>, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Subject: another HDFS configuration for scribeH
I was debugging a bizarre data corruption case in the silver cluster today and realized that there is a very important configuration that the scribeH cluster should set. Could you please set dfs.datanode.synconclose to true in ScribeH for next week's push? This will guarantee that block data gets persisted to disk on close, preventing data loss when datanodes get rebooted.
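For reference, a minimal sketch of what the change would look like in hdfs-site.xml (standard Hadoop configuration layout; the property name and value are as described above):

    <property>
      <!-- fsync block data to disk when a block file is closed,
           so a datanode reboot cannot lose recently written blocks -->
      <name>dfs.datanode.synconclose</name>
      <value>true</value>
    </property>

The datanodes would need to be restarted to pick up the change.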