Artem Ervits 2013-01-03, 22:00
RE: Hadoop throughput question
Let's suppose you are running a read-intensive job such as counting records.  This will be limited by disk bandwidth.  On a 4-node cluster with 2 local SATA drives on each node, you should easily read 400 MB/sec in aggregate.  When you are running the Hadoop cluster, is the Hadoop processing co-located with the Isilon nodes?  Is Hadoop configured to use OneFS or HDFS?
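
As a rough illustration of the arithmetic behind that 400 MB/sec figure, here is a back-of-the-envelope sketch in Python. The per-disk rate of 50 MB/sec is an assumption, obtained simply by dividing 400 MB/sec by the 8 disks; real drives and workloads will vary. The network ceiling uses only the line rate of 1GigE (1 Gbit/s = 125 MB/s) and ignores protocol overhead. The function names are hypothetical and not part of any Hadoop API.

```python
# Back-of-the-envelope estimate; all figures are assumptions, not measurements.

def aggregate_disk_mb_per_sec(nodes: int, disks_per_node: int,
                              mb_per_sec_per_disk: float) -> float:
    """Aggregate sequential read bandwidth if every local disk is read in parallel."""
    return nodes * disks_per_node * mb_per_sec_per_disk


def gige_ceiling_mb_per_sec(links: int, gbit_per_link: float = 1.0) -> float:
    """Theoretical line rate in MB/sec (1 Gbit/s = 125 MB/s), before protocol overhead."""
    return links * gbit_per_link * 1000 / 8


# Assumed: 50 MB/sec per SATA disk (400 MB/sec divided across 4 nodes x 2 disks).
local = aggregate_disk_mb_per_sec(nodes=4, disks_per_node=2, mb_per_sec_per_disk=50.0)

# Assumed: one 1GigE link per node, relevant when data is read from remote storage.
network = gige_ceiling_mb_per_sec(links=4)

print(f"local disks: {local:.0f} MB/sec, 4 x 1GigE ceiling: {network:.0f} MB/sec")
```

If the data lives on remote storage rather than on local disks, every byte has to cross the network, so the second number (or whatever the storage side can actually deliver) becomes the relevant upper bound instead of the first.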

From: Artem Ervits [mailto:[EMAIL PROTECTED]]
Sent: Thursday, January 03, 2013 3:00 PM
Subject: Hadoop throughput question

Hello all,

I'd like to pick the community's brain on average throughput speeds for a moderately specced 4-node Hadoop cluster with 1GigE networking. Is it reasonable to expect constant average speeds of 150-200 MB/sec on such a setup? Forgive me if the question is loaded, but we're running a Hadoop cluster with HDFS served via EMC Isilon storage. We're getting about 30 MB/sec with our machines, and we do not see a difference in job speed between a 2-node cluster and a 4-node cluster.

Thank you.