

Re: recommendation on HDDs
Bandwidth is definitely better with more active spindles.  I would recommend
several larger disks.  The cost is very nearly the same.
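
As a rough sanity check on the cost and spindle trade-off, here is a minimal Python sketch using the Microcenter prices Michael quotes further down in this thread. The `plan` helper is hypothetical, and the per-disk throughput constant is a made-up placeholder rather than a measurement. The aggregate figure also assumes bandwidth scales linearly with the number of disks, which is the assumption Shrinivas states below.

```python
import math

# Prices as quoted in the thread (Microcenter, February 2011), in US dollars.
# Each entry: label -> (capacity in GB, unit price).
DRIVES = {
    "2TB Hitachi": (2000, 110.00),
    "1TB Seagate": (1000, 70.00),
    "250GB SATA": (250, 45.00),
}

# Compare the cost of reaching 2 TB of raw capacity per node.
TARGET_GB = 2000

# ASSUMPTION: placeholder sustained throughput per spindle in MB/s.
# This number is not from the thread; substitute a measured value.
ASSUMED_MB_PER_SEC_PER_DISK = 100


def plan(capacity_gb, unit_price, target_gb=TARGET_GB):
    """Return (drive count, total cost, assumed aggregate MB/s) for target_gb."""
    count = math.ceil(target_gb / capacity_gb)
    total_cost = count * unit_price
    # Linear scaling with spindle count is an idealization.
    aggregate_mb_per_sec = count * ASSUMED_MB_PER_SEC_PER_DISK
    return count, total_cost, aggregate_mb_per_sec


if __name__ == "__main__":
    for label, (capacity_gb, unit_price) in DRIVES.items():
        count, cost, mbps = plan(capacity_gb, unit_price)
        print(f"{label}: {count} drive(s), ${cost:.2f}, ~{mbps} MB/s (assumed)")
```

With the 250GB drive as listed, reaching 2TB takes eight drives. The $180 figure quoted in the thread matches four drives at $45 each, which would fit a 500GB model instead.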

On Fri, Feb 11, 2011 at 3:52 PM, Shrinivas Joshi <[EMAIL PROTECTED]> wrote:

> Thanks for your inputs, Michael.  We have 6 open SATA ports on the
> motherboards. That is the reason why we are thinking of 4 to 5 data disks
> and 1 OS disk.
> Are you suggesting use of one 2TB disk instead of, let's say, four 500GB
> disks?
> I thought that the HDFS utilization/throughput increases with the number
> of disks per node (assuming that the total usable IO bandwidth increases
> proportionally).
>
> -Shrinivas
>
> On Thu, Feb 10, 2011 at 4:25 PM, Michael Segel <[EMAIL PROTECTED]> wrote:
>
> >
> > Shrinivas,
> >
> > Assuming you're in the US, I'd recommend the following:
> >
> > Go with 2TB 7200 SATA hard drives.
> > (Not sure what type of hardware you have)
> >
> > What we've found is that in the data nodes, there's an optimal
> > configuration that balances price versus performance.
> >
> > While your chassis may hold 8 drives, how many open SATA ports are on the
> > motherboard? Since you're using JBOD, you don't want the additional
> > expense of having to purchase a separate controller card for the
> > additional drives.
> >
> > I'm running Seagate drives at home and I haven't had any problems for
> > years.
> > When you look at your drive, you need to know total storage, speed
> > (RPM), and cache size.
> > Looking at Microcenter's pricing...
> > A 2TB 3.0Gb/s SATA Hitachi was $110.00
> > A 1TB Seagate was $70.00
> > A 250GB SATA drive was $45.00
> >
> > So 2TB of storage costs $110, $140, or $180 (respectively).
> >
> > So you get a better deal on 2TB.
> >
> > So if you go out and get more drives but of lower density, you'll end up
> > spending more money and use more energy, but I doubt you'll see a real
> > performance difference.
> >
> > The other thing is that if you want to add more disk, you have room to
> > grow. (Just add more disk and restart the node, right?)
> > If all of your disk slots are filled, you're SOL. You have to take out
> > the box, replace all of the drives, then add it to the cluster as a
> > 'new' node.
> >
> > Just my $0.02.
> >
> > HTH
> >
> > -Mike
> >
> > > Date: Thu, 10 Feb 2011 15:47:16 -0600
> > > Subject: Re: recommendation on HDDs
> > > From: [EMAIL PROTECTED]
> > > To: [EMAIL PROTECTED]
> > >
> > > Hi Ted, Chris,
> > >
> > > Much appreciate your quick reply. The reason why we are looking for
> > > smaller capacity drives is that we are not anticipating a huge growth
> > > in data footprint. We also read somewhere that the larger the capacity
> > > of the drive, the greater the number of platters in it, and that could
> > > affect drive performance. But it looks like you can get 1TB drives with
> > > only 2 platters. Large capacity drives should be OK for us as long as
> > > they perform equally well.
> > >
> > > Also, the systems that we have can host up to 8 SATA drives in them. In
> > > that case, would backplanes offer additional advantages?
> > >
> > > Any suggestions on 5400 vs. 7200 vs. 10000 RPM disks?  I guess 10K RPM
> > > disks would be overkill given their perf/cost trade-off?
> > >
> > > Thanks for your inputs.
> > >
> > > -Shrinivas
> > >
> > > On Thu, Feb 10, 2011 at 2:48 PM, Chris Collins <[EMAIL PROTECTED]> wrote:
> > >
> > > > Of late we have had serious issues with Seagate drives in our Hadoop
> > > > cluster.  These were purchased over several purchasing cycles, and we
> > > > are pretty sure it wasn't just a single "bad batch".  Because of this
> > > > we switched to buying 2TB Hitachi drives, which seem to have been
> > > > considerably more reliable.
> > > >
> > > > Best
> > > >
> > > > C
> > > > On Feb 10, 2011, at 12:43 PM, Ted Dunning wrote:
> > > >
> > > > > Get bigger disks.  Data only grows and having extra is always good.
> > > > >
> > > > > You can get 2TB drives for <$100 and 1TB for < $75.
> > > > >
> > > > > > As far as transfer rates are concerned, any 3Gb/s SATA drive is