-
measuring iops - Rita, 2012-10-21, 12:30
Hi,
I was curious whether there is a way to measure the total number of IOPS (I/O operations per second) on an HDFS cluster.
-
Re: measuring iops - Rita, 2012-10-22, 12:23
Anyone?
-
Re: measuring iops - Ravi Prakash, 2012-10-22, 14:28
Hi Rita,
SliveTest can help you measure the number of reads / writes / deletes / ls / appends per second your NameNode can handle.

DFSIO can be used to help you measure the amount of throughput.

Both these tests are actually very flexible and have a plethora of options to help you test different facets of performance. In my experience, you actually have to be very careful and understand what the tests are doing for the results to be sensible.

HTH
Ravi
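To make "operations per second" concrete, here is a minimal Python sketch of the kind of measurement such tests perform: time a fixed number of small file operations and divide the count by the elapsed time. It runs against a local temporary directory as a stand-in for HDFS. It is not SliveTest or DFSIO, and the name `measure_ops_per_sec` and its parameters are made up for illustration.

```python
import os
import tempfile
import time


def measure_ops_per_sec(num_files=200, payload=b"x" * 4096):
    """Time write, read and delete of small files; return ops/sec per phase.

    Assumption for illustration: a local temp directory, not HDFS.
    """
    results = {}
    with tempfile.TemporaryDirectory() as root:
        paths = [os.path.join(root, f"f{i}") for i in range(num_files)]

        def timed(name, op):
            start = time.perf_counter()
            for path in paths:
                op(path)
            elapsed = time.perf_counter() - start
            # Guard against a zero reading on very coarse timers.
            results[name] = num_files / max(elapsed, 1e-9)

        def write(path):
            with open(path, "wb") as f:
                f.write(payload)

        def read(path):
            with open(path, "rb") as f:
                f.read()

        timed("write", write)
        timed("read", read)
        timed("delete", os.remove)
    return results


if __name__ == "__main__":
    for phase, rate in measure_ops_per_sec().items():
        print(f"{phase}: {rate:.0f} ops/sec")
```

Even this toy version shows why the results need care: the reads here are very likely served from the operating system's page cache, so the read figure says little about the disks.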
-
Re: measuring iops - Rita, 2012-10-22, 23:46
Is it possible to know how many reads and writes are occurring across the entire cluster in a consolidated manner? This would not include replication factors.
-
Re: measuring iops - Ravi Prakash, 2012-10-23, 12:47
Do you mean in a cluster being used by users, or as a benchmark to measure the maximum?
The JMX page <nn:port>/jmx provides some interesting stats, but I'm not sure they have what you want. And I'm unaware of other tools which could.
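One way to turn the JMX page into a rate is to take two snapshots and divide the change in a cumulative counter by the time between them. The Python sketch below assumes the page returns JSON with a top-level "beans" list and that each daemon you care about serves such a page. The counter name `ReadOps` and the helper names are hypothetical; check the page on your own cluster for the attributes it actually exposes.

```python
import json
import urllib.request


def fetch_beans(base_url):
    """Return the "beans" list from the JSON served at <base_url>/jmx."""
    with urllib.request.urlopen(base_url.rstrip("/") + "/jmx") as resp:
        return json.load(resp).get("beans", [])


def sum_counter(beans, counter):
    """Add up one numeric attribute across every bean that exposes it."""
    total = 0
    for bean in beans:
        value = bean.get(counter)
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            total += value
    return total


def cluster_ops_per_sec(before, after, seconds, counter):
    """Rate of a cumulative counter between two snapshots.

    `before` and `after` map a node name to that node's bean list.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    delta = sum(
        sum_counter(after[node], counter)
        - sum_counter(before.get(node, []), counter)
        for node in after
    )
    return delta / seconds


# Synthetic snapshots taken 60 seconds apart ("ReadOps" is a made-up name).
before = {"dn1": [{"ReadOps": 100}], "dn2": [{"ReadOps": 50}]}
after = {"dn1": [{"ReadOps": 400}], "dn2": [{"ReadOps": 350}]}
print(cluster_ops_per_sec(before, after, 60, "ReadOps"))
```

In real use the snapshots would come from `fetch_beans` called once per node. A daemon restart resets its counters and would show up as a negative delta, and a counter recorded once per replica would include replication traffic, so whether the total matches what Rita asked for depends on the metric chosen.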
-
Re: measuring iops - Michael Segel, 2012-10-23, 13:19
You have two issues.
1) You need to know the throughput in terms of data transfer between the disks and controller cards on the node.

2) You need to know the actual network throughput when all of the nodes are talking to one another as fast as they can. This will let you see your real limitations in the ToR switch's fabric.

Not sure why you really want to do this, except to test the disk, the disk controller, and then the networking infrastructure of your ToR and the backplane that connects multiple racks.

HTH
-Mike
-
Re: measuring iops - Rita, 2012-10-23, 23:01
I was curious because a vendor (a big storage company) presented a Hadoop solution they were offering. They posted IOPS figures, and I wasn't sure how they were determining this number.
-
Re: measuring iops - Brian Bockelman, 2012-10-23, 23:40
Hi Rita,
I get a bit grumpy when I see IOPS as the primary metric with respect to HDFS.

Why? While IOPS are actually a relevant part of the system, many use cases of HDFS are for a *throughput oriented* workflow. So, in the traditional M/R use cases for HDFS, you likely will barely scratch the IOPS the system provides.

In fact, HDFS in 0.20 will create a separate TCP connection for each I/O operation. That should tell you how low random-access workflows ranked in the HDFS design.

As a disclaimer, there are use cases (particularly HBase, and how I currently use our HDFS install!) where IOPS are quite relevant. Just recall that they are not the end-all, be-all for HDFS performance measurement. It's not the primary number I would look for! Each install will have its own requirements.

Brian
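The arithmetic behind this point is that throughput equals IOPS multiplied by the size of each operation, so the same bandwidth can correspond to wildly different IOPS numbers. A small Python sketch, with made-up illustrative figures rather than measurements (100 MB/s, 64 MB sequential reads versus 4 KB random reads):

```python
def throughput_mb_per_sec(iops, io_size_kb):
    """Throughput in MB/s delivered by `iops` operations of `io_size_kb` each."""
    return iops * io_size_kb / 1024.0


def iops_needed(throughput_mb, io_size_kb):
    """Operations per second needed to move `throughput_mb` MB/s."""
    return throughput_mb * 1024.0 / io_size_kb


# Illustrative assumptions only: 100 MB/s moved in 64 MB block-sized reads
# versus the same 100 MB/s moved in 4 KB random reads.
print(iops_needed(100, 64 * 1024))   # 1.5625
print(iops_needed(100, 4))           # 25600.0
print(throughput_mb_per_sec(25600, 4))  # 100.0
```

Under these assumptions a job streaming large blocks needs only a handful of operations per second to saturate the same bandwidth that a random-access workload would need tens of thousands of operations to reach, which is why a quoted IOPS figure says little on its own without the operation size.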
-
Re: measuring iops - Lance Norskog, 2012-10-26, 11:56
This is a Hadoop benchmark suite. You can decide which benchmarks match your needs.
https://github.com/intel-hadoop/hibench (Haven't used it yet!)