Hadoop, mail # user - recommendation on HDDs


Re: recommendation on HDDs
Ted Dunning 2011-02-12, 00:14
Bandwidth is definitely better with more active spindles.  I would recommend
several larger disks.  The cost is very nearly the same.
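
As a rough check on the cost side, here is a minimal Python sketch that uses the Microcenter prices Michael quotes further down this thread. The names (DRIVES, cost_for_capacity, compare) are illustrative assumptions and not part of Hadoop. It rounds up to whole drives to reach 2TB. By this arithmetic, eight 250GB drives come to $360 rather than the $180 quoted in the thread; $180 would match four drives at $45, so the capacity or the total there may have been mistyped.

```python
# Prices as quoted in the thread (Microcenter, Feb 2011); capacities in GB.
# The structure and names here are assumptions made for illustration.
DRIVES = {
    "2TB Hitachi": (2000, 110.00),
    "1TB Seagate": (1000, 70.00),
    "250GB SATA": (250, 45.00),
}


def cost_for_capacity(target_gb, capacity_gb, price):
    """Return (drive_count, total_cost) needed to reach at least target_gb."""
    count = -(-target_gb // capacity_gb)  # ceiling division on integers
    return count, count * price


def compare(target_gb=2000):
    """Map each drive model to the count and cost needed for target_gb."""
    return {
        name: cost_for_capacity(target_gb, capacity_gb, price)
        for name, (capacity_gb, price) in DRIVES.items()
    }


if __name__ == "__main__":
    for name, (count, total) in compare().items():
        print(f"{name}: {count} drive(s), ${total:.2f} for 2TB")
```

This only covers purchase price. It says nothing about bandwidth per spindle, power draw, or the number of free SATA ports, all of which are discussed below.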

On Fri, Feb 11, 2011 at 3:52 PM, Shrinivas Joshi <[EMAIL PROTECTED]> wrote:

> Thanks for your inputs, Michael.  We have 6 open SATA ports on the
> motherboards. That is the reason why we are thinking of 4 to 5 data disks
> and 1 OS disk.
> Are you suggesting using one 2TB disk instead of, let's say, four 500GB
> disks?
> I thought that HDFS utilization/throughput increases with the number of
> disks per node (assuming that the total usable IO bandwidth increases
> proportionally).
>
> -Shrinivas
>
> On Thu, Feb 10, 2011 at 4:25 PM, Michael Segel <[EMAIL PROTECTED]> wrote:
>
> >
> > Shrinivas,
> >
> > Assuming you're in the US, I'd recommend the following:
> >
> > Go with 2TB 7200 SATA hard drives.
> > (Not sure what type of hardware you have)
> >
> > What  we've found is that in the data nodes, there's an optimal
> > configuration that balances price versus performance.
> >
> > While your chassis may hold 8 drives, how many open SATA ports are on the
> > motherboard? Since you're using JBOD, you don't want the additional
> > expense of having to purchase a separate controller card for the
> > additional drives.
> >
> > I'm running Seagate drives at home and I haven't had any problems for
> > years.
> > When you look at your drive, you need to know total storage, speed
> > (rpm), and cache size.
> > Looking at Microcenter's pricing:
> > A 2TB 3.0Gb/s SATA Hitachi was $110.00.
> > A 1TB Seagate was $70.00.
> > A 250GB SATA drive was $45.00.
> >
> > So 2TB = 110, 140, 180 (respectively)
> >
> > So you get a better deal on 2TB.
> >
> > So if you go out and get more drives but of lower density, you'll end up
> > spending more money and use more energy, but I doubt you'll see a real
> > performance difference.
> >
> > The other thing is that if you want to add more disk, you have room to
> > grow. (Just add more disk and restart the node, right?)
> > If all of your disk slots are filled, you're SOL. You have to take out
> > the box, replace all of the drives, then add it to the cluster as a
> > 'new' node.
> >
> > Just my $0.02.
> >
> > HTH
> >
> > -Mike
> >
> > > Date: Thu, 10 Feb 2011 15:47:16 -0600
> > > Subject: Re: recommendation on HDDs
> > > From: [EMAIL PROTECTED]
> > > To: [EMAIL PROTECTED]
> > >
> > > Hi Ted, Chris,
> > >
> > > Much appreciate your quick reply. The reason why we are looking for
> > > smaller capacity drives is that we are not anticipating huge growth in
> > > data footprint, and we also read somewhere that the larger the capacity
> > > of the drive, the greater the number of platters in it, and that could
> > > affect drive performance. But it looks like you can get 1TB drives with
> > > only 2 platters. Large capacity drives should be OK for us as long as
> > > they perform equally well.
> > >
> > > Also, the systems that we have can host up to 8 SATA drives in them.
> > > In that case, would backplanes offer additional advantages?
> > >
> > > Any suggestions on 5400 vs. 7200 vs. 10000 RPM disks?  I guess 10K RPM
> > > disks would be overkill considering their perf/cost advantage?
> > >
> > > Thanks for your inputs.
> > >
> > > -Shrinivas
> > >
> > > On Thu, Feb 10, 2011 at 2:48 PM, Chris Collins <[EMAIL PROTECTED]> wrote:
> > >
> > > > Of late we have had serious issues with Seagate drives in our Hadoop
> > > > cluster.  These were purchased over several purchasing cycles, and we
> > > > are pretty sure it wasn't just a single "bad batch".  Because of this
> > > > we switched to buying 2TB Hitachi drives, which seem to have been
> > > > considerably more reliable.
> > > >
> > > > Best
> > > >
> > > > C
> > > > On Feb 10, 2011, at 12:43 PM, Ted Dunning wrote:
> > > >
> > > > > Get bigger disks.  Data only grows and having extra is always good.
> > > > >
> > > > > You can get 2TB drives for <$100 and 1TB for < $75.
> > > > >
> > > > > As far as transfer rates are concerned, any 3Gb/s SATA drive is