Hadoop, mail # general - Dedicated disk for operating system


Re: Dedicated disk for operating system
Allen Wittenauer 2011-08-10, 19:04

On Aug 10, 2011, at 7:56 AM, Evert Lammerts wrote:

> A short, slightly off-topic question:
>
>>      Also note that in this configuration that one cannot take
>> advantage of the "keep the machine up at all costs" features in newer
>> Hadoop's, which require that root, swap, and the log area be mirrored
>> to be truly effective.  I'm not quite convinced that those features are
>> worth it yet for anything smaller than maybe a 12 disk config.
>
> Dell and Cloudera promote the C2100. I'd like to see the calculations behind that config.

If Dell is shipping the same box they shipped us to test a few months ago, the performance was pretty horrid vs. almost all their competitors.  The main problem was the controller--it was built for RAID, not for JBOD.  (... and then there is the OOB support...)

> Am I wrong thinking that keeping your cluster up with such dense nodes will only work if you have many (order of magnitude 100+) of them, and interconnected with 10Gb Ethernet? If you don't then recovery times from failing disks / rack switches are going to get crazy, right?

If one assumes that a bunch of nodes are failing at once, yes.  The irony is that ops teams tend to group repairs, so keeping them up might actually be the wrong thing in relation to actual practice.
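
The recovery-time concern in the question above can be put into rough numbers. What follows is a back-of-envelope sketch only, not anything posted in the thread: the function name, the `usable_fraction` parameter, and all example figures are illustrative assumptions. It assumes the failed node's data is re-replicated evenly by the surviving nodes, and it ignores disk throughput, replication throttling, and rack topology.

```python
def rereplication_hours(node_tb, surviving_nodes, link_gbps, usable_fraction=0.5):
    """Rough hours needed to re-replicate one failed node's data.

    node_tb          -- data stored on the failed node, in terabytes (assumed)
    surviving_nodes  -- nodes left to share the recovery work (assumed)
    link_gbps        -- per-node network link speed in Gb/s (assumed)
    usable_fraction  -- share of each link available for recovery traffic (assumed)
    """
    data_bits = node_tb * 1e12 * 8
    # Aggregate bandwidth the survivors can devote to recovery.
    aggregate_bps = surviving_nodes * link_gbps * 1e9 * usable_fraction
    return data_bits / aggregate_bps / 3600


if __name__ == "__main__":
    # Small cluster of dense nodes on 1Gb Ethernet (hypothetical figures).
    print(round(rereplication_hours(24, 9, 1), 2))
    # Larger cluster on 10Gb Ethernet (hypothetical figures).
    print(round(rereplication_hours(24, 99, 10), 2))
```

Under these assumptions the estimate scales linearly with node capacity and inversely with both node count and link speed, which is the trade-off the question is pointing at.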

> If you want to get bang for buck, don't the proportions "disk IO / processing power", "node storage capacity / ethernet speed" and "total amount of nodes / ethernet speed", indicate many small nodes with not too many disks and 1Gb Ethernet?

The biggest constraint is almost always RAM, as you can use it to help with the rest.