This is a Hadoop benchmark suite. You can decide which benchmarks match your needs.
https://github.com/intel-hadoop/hibench (I haven't used it yet!)
----- Original Message -----
| From: "Brian Bockelman" <[EMAIL PROTECTED]>
| To: [EMAIL PROTECTED]
| Sent: Tuesday, October 23, 2012 4:40:04 PM
| Subject: Re: measuring iops
|
| Hi Rita,
|
| I get a bit grumpy when I see IOPS as the primary metric with respect
| to HDFS.
|
| Why? While IOPS are actually a relevant part of the system, many use
| cases of HDFS are for a *throughput oriented* workflow. So, in the
| traditional M/R use cases for HDFS, you likely will barely scratch
| the IOPS the system provides.
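
To put rough numbers on the throughput-versus-IOPS point above, here is a minimal sketch. The 100 MB/s disk rate, the 64 MB streaming read, and the 4 KB random read are all illustrative assumptions, not measurements from anyone in this thread.

```python
MB = 1024 * 1024

def iops_for_throughput(throughput_bytes_per_s: float, io_size_bytes: int) -> float:
    """Operations per second needed to move data at a given rate and request size."""
    return throughput_bytes_per_s / io_size_bytes

# Assumed figures, for illustration only.
disk_throughput = 100 * MB   # bytes per second the disk can stream
streaming_io = 64 * MB       # one sequential read of a whole 64 MB block
random_io = 4 * 1024         # one 4 KB random read

streaming_iops = iops_for_throughput(disk_throughput, streaming_io)
random_iops = iops_for_throughput(disk_throughput, random_io)

print(f"streaming: {streaming_iops:.4f} ops/s at full throughput")
print(f"random:    {random_iops:.0f} ops/s would be needed for the same throughput")
```

Under these assumptions a streaming job saturates the disk with fewer than two operations per second, so an IOPS figure says very little about how such a job will perform.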
|
| In fact, HDFS in 0.20 will create a separate TCP connection for each
| I/O operation - that should tell you how low random-access workflows
| ranked in the HDFS design.
|
| As a disclaimer, there are use cases (particularly HBase, and how I
| currently use our HDFS install!) where IOPS are quite relevant.
| Just recall that they are not the end-all, be-all for HDFS
| performance measurement. It's not the primary number I would look
| for! Each install will have its own requirements.
|
| Brian
|
| On Oct 23, 2012, at 6:01 PM, Rita <[EMAIL PROTECTED]> wrote:
|
| > I was curious because when a vendor (big storage company) presented,
| > they were offering a Hadoop solution. They posted IOPS and I wasn't
| > sure how they were determining this number....
| >
| >
| >
| > On Tue, Oct 23, 2012 at 9:19 AM, Michael Segel
| > <[EMAIL PROTECTED]> wrote:
| >
| >> You have two issues.
| >>
| >> 1) You need to know the throughput in terms of data transfer between
| >> disks and controller cards on the node.
| >>
| >> 2) The actual network throughput of having all of the nodes talking to
| >> one another as fast as they can. This will let you see your real
| >> limitations in the ToR Switch's fabric.
| >>
| >> Not sure why you really want to do this except to test the disk, disk
| >> controller, and then networking infrastructure of your ToR and then
| >> your backplane to connect multiple racks....
| >>
| >>
| >> HTH
| >>
| >> -Mike
| >>
| >> On Oct 23, 2012, at 7:47 AM, Ravi Prakash <[EMAIL PROTECTED]>
| >> wrote:
| >>
| >>> Do you mean in a cluster being used by users, or as a benchmark to
| >>> measure the maximum?
| >>>
| >>> The JMX page <nn:port>/jmx provides some interesting stats, but I'm
| >>> not sure they have what you want. And I'm unaware of other tools
| >>> which could.
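
For anyone who wants to try the JMX page mentioned above, here is a rough sketch of turning two snapshots of its counters into per-second rates. The servlet returns JSON with a top-level "beans" list; the bean name and counter names used below are assumptions based on later Hadoop releases and may differ or be absent on your version, and the sample numbers are made up. Note that these count NameNode operations, not disk I/O operations.

```python
import json
import urllib.parse
import urllib.request

# Assumed bean and counter names; check your own <nn:port>/jmx output.
BEAN = "Hadoop:service=NameNode,name=NameNodeActivity"
KEYS = ("CreateFileOps", "GetBlockLocations")

def fetch_beans(base_url, query=BEAN):
    """Fetch the bean list from a NameNode JMX page, e.g. base_url='http://nn:port'."""
    url = f"{base_url}/jmx?" + urllib.parse.urlencode({"qry": query})
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["beans"]

def counters(beans, bean_name=BEAN, keys=KEYS):
    """Pick the requested counters out of the named bean, skipping missing ones."""
    for bean in beans:
        if bean.get("name") == bean_name:
            return {k: bean[k] for k in keys if k in bean}
    return {}

def ops_per_second(before, after, elapsed_s):
    """Per-second rate for every counter present in both snapshots."""
    return {k: (after[k] - before[k]) / elapsed_s for k in after if k in before}

# Made-up snapshots standing in for two fetch_beans() calls taken 10 s apart.
sample_before = [{"name": BEAN, "CreateFileOps": 1000, "GetBlockLocations": 5000}]
sample_after = [{"name": BEAN, "CreateFileOps": 1300, "GetBlockLocations": 6500}]

rates = ops_per_second(counters(sample_before), counters(sample_after), 10.0)
print(rates)
```

Against a live cluster you would replace the two sample lists with two calls to fetch_beans(), separated by a sleep of known length.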
| >>>
| >>>
| >>>
| >>>
| >>>
| >>> ________________________________
| >>> From: Rita <[EMAIL PROTECTED]>
| >>> To: [EMAIL PROTECTED]; Ravi Prakash
| >>> <[EMAIL PROTECTED]>
| >>> Sent: Monday, October 22, 2012 6:46 PM
| >>> Subject: Re: measuring iops
| >>>
| >>> Is it possible to know how many reads and writes are occurring
| >>> through the entire cluster in a consolidated manner -- not
| >>> counting replication?
| >>>
| >>>
| >>> On Mon, Oct 22, 2012 at 10:28 AM, Ravi Prakash <[EMAIL PROTECTED]>
| >>> wrote:
| >>>
| >>>> Hi Rita,
| >>>>
| >>>> SliveTest can help you measure the number of reads / writes /
| >>>> deletes / ls / appends per second your NameNode can handle.
| >>>>
| >>>> DFSIO can be used to help you measure the amount of throughput.
| >>>>
| >>>> Both these tests are actually very flexible and have a plethora of
| >>>> options to help you test different facets of performance. In my
| >>>> experience, you actually have to be very careful and understand what
| >>>> the tests are doing for the results to be sensible.
| >>>>
| >>>> HTH
| >>>> Ravi
| >>>>
| >>>>
| >>>>
| >>>>
| >>>> ________________________________
| >>>> From: Rita <[EMAIL PROTECTED]>
| >>>> To: "<[EMAIL PROTECTED]>"
| >>>> <[EMAIL PROTECTED]>
| >>>> Sent: Monday, October 22, 2012 7:23 AM
| >>>> Subject: Re: measuring iops
| >>>>
| >>>> Anyone?
| >>>>
| >>>>
| >>>> On Sun, Oct 21, 2012 at 8:30 AM, Rita <[EMAIL PROTECTED]>
| >>>> wrote:
| >>>>
| >>>>> Hi,
| >>>>>
| >>>>> Was curious if there was a method to measure the total number of
| >>>>> IOPS (I/O operations per second) on an HDFS cluster.
| >>>>>
| >>>>>
| >>>>>
| >>>>> --
| >>>>> --- Get your facts first, then you can distort them as you
| >>>>> please.--
| >>>>>
| >>>>
| >>>>
| >>>>
| >>>>
| >>>
| >>>
| >>>
| >>
| >>
| >
| >
|
|