HDFS, mail # user - Hadoop throughput question


RE: Hadoop throughput question
Artem Ervits 2013-01-03, 22:46
I have a 4.5 GB file with records in SequenceFile format. If I use the SequenceFile.Reader class to count the records in this file, which come to 5.5 million, it takes 176 seconds, or roughly 26 MB/sec.

Thank you.
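
As a quick check of the arithmetic in the message above, the quoted figures can be plugged into a short Python sketch. It assumes 1 GB = 1024 MB (the message does not say which convention was used); the function and variable names are illustrative only and are not part of any Hadoop API.

```python
# Figures quoted in the message above.
FILE_SIZE_GB = 4.5
RECORDS = 5_500_000
ELAPSED_S = 176


def throughput_mb_per_s(size_gb, seconds, mb_per_gb=1024):
    """Average read rate in MB/sec; mb_per_gb=1024 is an assumed convention."""
    return size_gb * mb_per_gb / seconds


mb_s = throughput_mb_per_s(FILE_SIZE_GB, ELAPSED_S)
records_s = RECORDS / ELAPSED_S
# Average on-disk size per record, including any SequenceFile overhead.
avg_record_bytes = FILE_SIZE_GB * 1024**3 / RECORDS

print(f"{mb_s:.1f} MB/sec, {records_s:.0f} records/sec, "
      f"{avg_record_bytes:.0f} bytes/record on average")
```

With either the 1024 or the 1000 convention the rate rounds to about 26 MB/sec, which matches the number reported. The sketch only restates the measurement; it says nothing about where the time is spent (network, storage, or record deserialization).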

From: Michael Katzenellenbogen [mailto:[EMAIL PROTECTED]]
Sent: Thursday, January 03, 2013 5:08 PM
To: [EMAIL PROTECTED]
Subject: Re: Hadoop throughput question

Loaded question indeed.

How are you measuring that 30 MB/s? Is that per machine / NIC? HDFS throughput? Some other metric?

-Michael

On Jan 3, 2013, at 5:01 PM, Artem Ervits <[EMAIL PROTECTED]> wrote:
Hello all,

I'd like to pick the community brain on average throughput speeds for a moderately specced 4-node Hadoop cluster with 1GigE networking. Is it reasonable to expect constant average speeds of 150-200 MB/sec on such a setup? Forgive me if the question is loaded, but we're running a Hadoop cluster with HDFS served via EMC Isilon storage. We're getting about 30 MB/sec with our machines, and we do not see a difference in job speed between a 2-node cluster and a 4-node cluster.

Thank you.
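
For context on the numbers in the question above, here is a rough Python sketch comparing them with the raw line rate of a single 1GigE link. It assumes "mb" in the question means megabytes, uses decimal units (1 MB = 1,000,000 bytes), and ignores Ethernet/TCP overhead, so real usable throughput is somewhat lower. The names are illustrative only.

```python
GIGABIT_BITS_PER_S = 1_000_000_000  # nominal 1GigE line rate


def line_rate_mb_per_s(bits_per_s=GIGABIT_BITS_PER_S):
    """Raw link capacity in MB/sec (decimal MB, no protocol overhead)."""
    return bits_per_s / 8 / 1_000_000


per_nic = line_rate_mb_per_s()

observed = 30                       # MB/sec reported in the question
target_low, target_high = 150, 200  # MB/sec range asked about

print(f"single 1GigE link ceiling: {per_nic:.0f} MB/sec")
print(f"observed rate is {observed / per_nic:.0%} of one link")
print(f"target exceeds one link: {target_low > per_nic}")
```

Under these assumptions a single link tops out at 125 MB/sec, so 150-200 MB/sec could only be an aggregate across several nodes, not a per-machine figure. Whether the shared storage or something else is the limiting factor here cannot be determined from the question alone.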

--------------------

This electronic message is intended to be for the use only of the named recipient, and may contain information that is confidential or privileged.  If you are not the intended recipient, you are hereby notified that any disclosure, copying, distribution or use of the contents of this message is strictly prohibited.  If you have received this message in error or are not the named recipient, please notify us immediately by contacting the sender at the electronic mail address noted above, and delete and destroy all copies of this message.  Thank you.
