MapReduce >> mail # user >> TestDFSIO info required


TestDFSIO info required
Hi,

I ran TestDFSIO in my Hadoop cluster:
*hadoop jar /usr/lib/hadoop-0.20/hadoop-test.jar TestDFSIO -write -nrFiles 100 -fileSize 10240*
The report generated is:
*12/08/30 01:31:34 INFO fs.TestDFSIO: ----- TestDFSIO ----- : write
12/08/30 01:31:34 INFO fs.TestDFSIO:            Date & time: Thu Aug 30 01:31:34 CDT 2012
12/08/30 01:31:34 INFO fs.TestDFSIO:        Number of files: 100
12/08/30 01:31:34 INFO fs.TestDFSIO: Total MBytes processed: 1024000.0
12/08/30 01:31:34 INFO fs.TestDFSIO:      Throughput mb/sec: 5.54130695296031
12/08/30 01:31:34 INFO fs.TestDFSIO: Average IO rate mb/sec: 5.875064849853516
12/08/30 01:31:34 INFO fs.TestDFSIO:  IO rate std deviation: 1.503623716482166
12/08/30 01:31:34 INFO fs.TestDFSIO:     Test exec time sec: 3490.168*

I was referring to this blog post:

http://www.michael-noll.com/blog/2011/04/09/benchmarking-and-stress-testing-an-hadoop-cluster-with-terasort-testdfsio-nnbench-mrbench/

As per my understanding of that blog post, I calculated *Throughput = (1024000 * 1000) / 3490.168 = 293395.61*, which is clearly not the throughput shown in the report.

Then I found a file in the HDFS output directory of the job:

*hadoop fs -cat /benchmarks/TestDFSIO/io_write/part-00000* gave me this:

*f:rate 587506.5
f:sqrate 3677727.2
l:size 1073741824000
l:tasks 100
l:time 184793950*
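
For reference, the contents of that file can be read into a small dictionary. This is only a sketch: it assumes the `f:` and `l:` prefixes mark float and long (integer) values, and `parse_part_file` is a hypothetical helper name, not something shipped with Hadoop.

```python
def parse_part_file(text):
    """Parse 'prefix:name value' lines into a dict keyed by name.

    Assumption: 'f:' means a float value and 'l:' means a long (integer)
    value; any other prefix is kept as a plain string.
    """
    casts = {"f": float, "l": int}
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, value = line.split(None, 1)         # e.g. "f:rate", "587506.5"
        prefix, name = key.split(":", 1)         # e.g. "f", "rate"
        stats[name] = casts.get(prefix, str)(value)
    return stats


# The values printed by "hadoop fs -cat" in this post.
sample = """f:rate 587506.5
f:sqrate 3677727.2
l:size 1073741824000
l:tasks 100
l:time 184793950"""

stats = parse_part_file(sample)
print(stats)
```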

Then I used the time value from this file in the same formula: *Throughput = (1024000 * 1000) / 184793950 = 5.541*, which matches my reported throughput.
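
As a sanity check, all three figures in the report can be reproduced from the values in part-00000. The sketch below is not taken from the TestDFSIO source; it assumes that `l:size` is in bytes, that `l:time` is in milliseconds, and that `f:rate` and `f:sqrate` are sums scaled by 1000. The function name `dfsio_metrics` is hypothetical.

```python
import math

MEGA = 1024 * 1024  # bytes per MB, so 1073741824000 bytes -> 1024000 MB


def dfsio_metrics(rate, sqrate, size, tasks, time_ms):
    """Recompute the report figures from the part-00000 values.

    Assumptions (not confirmed in this post): size is in bytes, time_ms is
    in milliseconds, and rate / sqrate are scaled by 1000.
    """
    throughput = (size / MEGA) * 1000 / time_ms          # MB/sec
    avg_io_rate = rate / 1000 / tasks                    # MB/sec
    std_dev = math.sqrt(abs(sqrate / 1000 / tasks - avg_io_rate ** 2))
    return throughput, avg_io_rate, std_dev


throughput, avg_io_rate, std_dev = dfsio_metrics(
    rate=587506.5,
    sqrate=3677727.2,
    size=1073741824000,
    tasks=100,
    time_ms=184793950,
)

print("Throughput mb/sec:      %.4f" % throughput)
print("Average IO rate mb/sec: %.4f" % avg_io_rate)
print("IO rate std deviation:  %.4f" % std_dev)
```

Under these assumptions the three numbers come out close to the 5.541, 5.875 and 1.504 shown in the report, which suggests the time in the file is measured in milliseconds and is much larger than the 3490.168 seconds of test execution time.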

Can someone tell me what exactly this time value in the HDFS output
directory file "part-00000" represents?

Thanks,

Gaurav Dasgupta
Gaurav Dasgupta 2012-08-30, 10:56