Hadoop >> mail # user >> Re: Poor IO performance on a 10 node cluster.
Re: Poor IO performance on a 10 node cluster.
Some things which helped us include setting your vm.swappiness to 0 and
mounting your disks with noatime,nodiratime options.
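
To check whether a node already has these settings, something like the following Python sketch might help. It assumes a Linux host, and the `/data` prefix for the Hadoop data disks is purely hypothetical; adjust it to wherever your disks are mounted. As far as I know, `noatime` on Linux already covers directory access times, so the sketch only looks for `noatime`.

```python
from pathlib import Path


def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Return the current vm.swappiness value as an int (Linux only)."""
    return int(Path(path).read_text().strip())


def mounts_without_noatime(mounts_text, prefix="/data"):
    """Return mount points under `prefix` whose options lack noatime.

    `mounts_text` is expected in /proc/mounts format:
    device mountpoint fstype options dump pass
    The "/data" default is a hypothetical location for the data disks.
    """
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, options = fields[1], fields[3].split(",")
        if mount_point.startswith(prefix) and "noatime" not in options:
            missing.append(mount_point)
    return missing


# Made-up sample in /proc/mounts format, for illustration only.
SAMPLE_MOUNTS = (
    "/dev/sdb1 /data/1 ext3 rw,noatime,nodiratime 0 0\n"
    "/dev/sdc1 /data/2 ext3 rw,relatime 0 0\n"
)
print(mounts_without_noatime(SAMPLE_MOUNTS))
```

On a real node you would pass `Path("/proc/mounts").read_text()` instead of the sample string, and compare `read_swappiness()` against 0.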

Also make sure your disks aren't set up with RAID (JBOD is recommended).

You might want to run terasort as you tweak your environment.  It's very
helpful when checking if a change helped (or hurt) your cluster.
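
One way to make the before/after comparison repeatable is a small timing wrapper, sketched below in Python. This is only a rough harness, not a standard tool; the terasort command line shown in the comment is an assumption, since the examples jar name and the HDFS paths depend on your installation.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run `cmd` (a list of arguments) and return elapsed wall-clock seconds.

    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    start = time.monotonic()
    subprocess.run(cmd, check=True)
    return time.monotonic() - start


def percent_change(before, after):
    """Relative change in run time; negative means the second run was faster."""
    return (after - before) / before * 100.0


# Harmless demo: time a no-op Python subprocess.
elapsed = time_command([sys.executable, "-c", "pass"])
print(f"no-op run took {elapsed:.3f} s")

# For a real benchmark the command might look like this
# (jar name and paths are hypothetical placeholders):
#   time_command(["hadoop", "jar", "hadoop-examples.jar",
#                 "terasort", "/bench/input", "/bench/output"])
```

Run it once before a change and once after, then feed both timings to `percent_change` to see whether the tweak helped or hurt.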

Hope that helps a bit.

On 05/30/2011 06:27 AM, Gyuribácsi wrote:
>
> Hi,
>
> I have a 10 node cluster (IBM blade servers, 48GB RAM, 2x500GB Disk, 16 HT
> cores).
>
> I've uploaded 10 files to HDFS. Each file is 10GB. I used the streaming jar
> with 'wc -l' as mapper and 'cat' as reducer.
>
> I use 64MB block size and the default replication (3).
>
> The wc on the 100 GB took about 220 seconds, which translates to about 3.5
> Gbit/sec processing speed. One disk can do sequential reads at 1 Gbit/sec, so
> I would expect something around 20 Gbit/sec (minus some overhead), and I'm
> getting only 3.5.
>
> Is my expectation valid?
>
> I checked the JobTracker and it seems all nodes are working, each reading
> the right blocks. I have not played with the number of mappers and reducers
> yet. It seems the number of mappers is the same as the number of blocks and
> the number of reducers is 20 (there are 20 disks). This looks OK to me.
>
> We also did an experiment with TestDFSIO with similar results. Aggregate
> read I/O speed is around 3.5 Gbit/sec. It is just too far from my
> expectation :(
>
> Please help!
>
> Thank you,
> Gyorgy
>
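
The throughput figures in the quoted message can be checked with a few lines of Python. This is just the arithmetic from the post, assuming decimal gigabytes, 8 bits per byte, and the poster's own estimate of 1 Gbit/sec sequential read per disk.

```python
def gbit_per_sec(gigabytes, seconds):
    """Convert a data volume in GB and a duration in seconds to Gbit/s."""
    return gigabytes * 8 / seconds


# 10 files x 10 GB, processed in about 220 seconds (figures from the post).
measured = gbit_per_sec(100, 220)

# 10 nodes x 2 disks each, at the poster's estimate of 1 Gbit/s per disk.
nodes, disks_per_node, per_disk_gbit = 10, 2, 1.0
expected = nodes * disks_per_node * per_disk_gbit

print(f"measured ~ {measured:.2f} Gbit/s")
print(f"expected ~ {expected:.0f} Gbit/s raw disk ceiling")
print(f"fraction ~ {measured / expected:.0%}")
```

The job comes out at roughly 3.6 Gbit/sec, a little under a fifth of the 20 Gbit/sec raw disk ceiling, which is consistent with the "about 3.5" in the post.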