MapReduce, mail # user - Re: Hardware Selection for Hadoop


Re: Hardware Selection for Hadoop
Michael Segel 2013-05-07, 12:53
I wouldn't.

You end up with a 'Frankencluster' which could become problematic down the road.

Ever try to debug a port failure on a switch? (It does happen, and it's a bitch.)
Note that you say 'reliable'... older hardware may or may not be reliable.... or under warranty.
(How many here build their own servers from the components up?  ;-)

I'm not suggesting that you go out and buy a 10-core CPU; however, depending on who you are and what your budget is... it may make sense.
Even for a proof of concept. ;-)

While we have a rough metric on spindles to cores, you end up putting stress on the disk controllers. YMMV.

As to spending $$$ on hardware for a PoC, it's not only relative... but also, what makes you think this is the first and only PoC he's going to do? The point is that hardware is reusable, and it also sets a pattern for what the future cluster will look like. After this PoC, why not look at Storm, Mesos, Spark, Shark, etc...

Trust me, as someone who has had to fight for allocation of hardware dollars for R&D... get the best bang you can for your buck.

HTH

-Mike

On May 6, 2013, at 5:57 PM, Patai Sangbutsarakum <[EMAIL PROTECTED]> wrote:

> I really doubt he would spend $ to buy a 10-cores-on-a-die CPU for "proof of concept" machines.
> Actually, I would even suggest gathering as many old (but reliable) machines as you can collect.
> Put in as many disks and as much RAM as you can, team the NICs if you can, and at that point you can prove your concept up to a certain point.
>
> You will get an idea of how your application will behave, how big a data set you will play with,
> and whether the application is CPU- or I/O-bound; from that you can go shopping for the best-fit server configuration.
>
>
>
> On May 6, 2013, at 4:17 AM, Michel Segel <[EMAIL PROTECTED]> wrote:
>
>> 8 physical cores is so 2009 - 2010 :-)
>>
>> Intel now offers a chip with 10 physical cores on a die.
>> You are better off thinking of 4-8 GB per physical core.
>> It depends on what you want to do, and what you think you may want to do...
>>
>> It also depends on the price points of the hardware. Memory, drives, CPUs (price by clock speeds...) you just need to find the right optimum between price and performance...
>>
>>
>> Sent from a remote device. Please excuse any typos...
>>
>> Mike Segel
>>
>> On May 5, 2013, at 1:47 PM, Ted Dunning <[EMAIL PROTECTED]> wrote:
>>
>>>
>>> Data nodes normally are also task nodes.  With 8 physical cores it isn't that unreasonable to have 64GB whereas 24GB really is going to pinch.
>>>
>>> Achieving highest performance requires that you match the capabilities of your nodes including CPU, memory, disk and networking.  The standard wisdom is 4-6GB of RAM per core, at least a spindle per core and 1/2 to 2/3 of disk bandwidth available as network bandwidth.
>>>
>>> If you look at the different configurations mentioned in this thread, you will see different limitations.
>>>
>>> For instance:
>>>
>>> 2 x Quad cores Intel
>>> 2-3 TB x 6 SATA         <==== 6 disks < desired 8 or more
>>> 64GB mem                <==== slightly larger than necessary
>>> 2 x 1GbE NICs teaming   <==== 2 x 100 MB/s << 400 MB/s = 2/3 x 6 x 100 MB/s
>>>
>>> This configuration is mostly limited by networking bandwidth.
>>>
>>> 2 x Quad cores Intel
>>> 2-3 TB x 6 SATA         <==== 6 disks < desired 8 or more
>>> 24GB mem                <==== 24GB << 8 x 6GB
>>> 2 x 10GbE NICs teaming  <==== 2 x 1000 MB/s > 400 MB/s = 2/3 x 6 x 100 MB/s
>>>  
>>> This configuration is weak on disk relative to CPU and very weak on disk relative to network speed.  The worst problem, however, is likely to be small memory.  This will likely require us to decrease the number of slots by half or more making it impossible to even use the 6 disks that we have and making the network even more outrageously over-provisioned.
>>>  
>>>
>>>
>>>
>>> On Sun, May 5, 2013 at 9:41 AM, Rahul Bhattacharjee <[EMAIL PROTECTED]> wrote:
>>> IMHO, 64 GB looks a bit high for a DN. 24 GB should be good enough for a DN.
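
As a rough illustration of the rules of thumb Ted quotes in this thread (4-6 GB of RAM per core, at least one spindle per core, and network bandwidth of 1/2 to 2/3 of disk bandwidth), here is a minimal Python sketch that checks a node spec against them. The `Node` class, the `check` function, and the constant names are hypothetical and exist only for this example. The 100 MB/s per disk, 100 MB/s per 1GbE port and 1000 MB/s per 10GbE port figures simply mirror the arithmetic in the thread; they are assumptions, not measurements.

```python
from dataclasses import dataclass

# Assumed per-spindle throughput in MB/s, the round number used in the thread.
DISK_MB_S = 100
# Lower bound of the "4-6GB of RAM per core" rule of thumb.
MIN_RAM_GB_PER_CORE = 4


@dataclass
class Node:
    cores: int      # physical cores
    disks: int      # data spindles
    ram_gb: int     # installed memory in GB
    nic_mb_s: int   # aggregate (teamed) network bandwidth in MB/s


def check(node: Node) -> dict:
    """Compare one node spec against the thread's rules of thumb."""
    disk_bw = node.disks * DISK_MB_S
    net_low = disk_bw // 2         # 1/2 of disk bandwidth
    net_high = disk_bw * 2 // 3    # 2/3 of disk bandwidth
    return {
        "ram_ok": node.ram_gb >= MIN_RAM_GB_PER_CORE * node.cores,
        "disks_ok": node.disks >= node.cores,
        "net_ok": node.nic_mb_s >= net_low,
        "net_target_mb_s": (net_low, net_high),
    }


# The two example configurations from the thread (2 x quad core = 8 cores).
config_1gbe = Node(cores=8, disks=6, ram_gb=64, nic_mb_s=2 * 100)
config_10gbe = Node(cores=8, disks=6, ram_gb=24, nic_mb_s=2 * 1000)

if __name__ == "__main__":
    for name, node in [("64GB / 2 x 1GbE", config_1gbe),
                       ("24GB / 2 x 10GbE", config_10gbe)]:
        print(name, check(node))
```

Under these assumptions the first configuration fails the disk and network checks but passes on memory, and the second passes on network but fails on memory and disks, which matches the limitations described in the thread. The thresholds are starting points only; as noted elsewhere in the thread, the right balance depends on the workload and on price points.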