On second thought, there should not be any race.
You probably restart the HDFS cluster between the runs.
When you shut down the cluster after the first run, some files
may still remain unclosed. Then, after restarting the cluster,
you will have all their leases renewed, and anybody who tries
to recreate an unclosed file will fail with AlreadyBeingCreatedException.
If my guess is correct, then you should keep the cluster running
between consecutive DFSIO runs.
Cleaning up will still help keep the benchmark data consistent.
If a bunch of files are recreated, HDFS will start removing the old file blocks.
This increases the internal load and skews the performance results.
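
For illustration, here is a rough sketch of the run order I have in mind: write, clean, write again, all against a cluster that stays up. It only builds the command lines and prints them. The jar name "hadoop-test.jar" and the file count and size are placeholders I made up, so substitute whatever your installation uses.

```python
def dfsio_sequence(test_jar, nr_files, file_size_mb):
    """Return argv lists for: write run, cleanup, second write run."""
    base = ["hadoop", "jar", test_jar, "TestDFSIO"]
    write = base + ["-write",
                    "-nrFiles", str(nr_files),
                    "-fileSize", str(file_size_mb)]
    # -clean removes the benchmark files left by the previous run,
    # so the next run does not recreate existing files.
    clean = base + ["-clean"]
    return [write, clean, write]


if __name__ == "__main__":
    # Placeholder jar name and sizes (assumptions, not from this thread).
    for argv in dfsio_sequence("hadoop-test.jar", 10, 1000):
        print(" ".join(argv))
```

The point is only the ordering: the cleanup goes between the two runs, and the cluster is not restarted anywhere in the sequence.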
On 5/14/2010 2:26 PM, Konstantin Shvachko wrote:
> Hi Lavanya,
> On 5/14/2010 10:51 AM, Lavanya Ramakrishnan wrote:
> > Hello,
> > I am running org.apache.hadoop.fs.TestDFSIO to benchmark our HDFS
> > installation and had a couple of questions regarding the same.
> > a) If I run the benchmark back to back in the same directory, I start
> > seeing strange errors such as NotReplicatedYetException or
> > AlreadyBeingCreatedException (failed to create file .... on client 5,
> > because this file is already being created by DFSClient_.... on ...). It
> > seems like there might be some kind of race condition between the
> > replication from a previous run and subsequent runs. Is there any way to
> > avoid this?
> Yes, this looks like a race with the previous run.
> You can just wait or run TestDFSIO -clean before the second run.
> > b) I have been testing with concurrent writers and see a significant
> > drop in throughput. I get about 60 MB/s for 1 writer and about 8 MB/s
> > for 50 concurrent writers. Is this a known scalability limit of HDFS? Is
> > there any way to configure this to perform better?
> It depends on the size and the configuration of your cluster.
> In general for consistent results with DFSIO it is better to set up 1 or 2
> tasks per node. And specify as many files for DFSIO as you have map slots.
> The idea is that all maps finish in one wave.
> Then you should get optimal performance.
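
To make the "one wave" advice concrete, here is a small arithmetic sketch. DFSIO runs one map per file, so the number of waves is the file count divided by the total map slots, rounded up. The cluster size below (10 nodes, 2 map slots each) is a hypothetical example, since the actual size was not given in this thread.

```python
import math


def files_for_one_wave(nodes, map_slots_per_node):
    """Number of DFSIO files that exactly fills all map slots once."""
    return nodes * map_slots_per_node


def map_waves(nr_files, nodes, map_slots_per_node):
    """How many waves of maps are needed for nr_files (one map per file)."""
    total_slots = nodes * map_slots_per_node
    return math.ceil(nr_files / total_slots)


if __name__ == "__main__":
    # Hypothetical cluster: 10 nodes with 2 map slots each.
    print(files_for_one_wave(10, 2))  # files to request for a single wave
    print(map_waves(50, 10, 2))       # waves needed for 50 files
```

If the second number is greater than 1, some maps wait for a free slot and the measured throughput is no longer comparable between runs.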