MapReduce >> mail # user >> Re: Hardware Selection for Hadoop
Re: Hardware Selection for Hadoop
Thanks Mohit and Ted!
On Mon, May 6, 2013 at 9:11 AM, Rahul Bhattacharjee <[EMAIL PROTECTED]> wrote:

> OK. I'm not sure I understand the spindle/core relationship. I will dig
> more into that.
>
> Thanks for the info.
>
> One more thing: what's the significance of multiple NICs?
>
> Thanks,
> Rahul
>
>
> On Mon, May 6, 2013 at 12:17 AM, Ted Dunning <[EMAIL PROTECTED]> wrote:
>
>>
>> Data nodes normally are also task nodes.  With 8 physical cores it isn't
>> that unreasonable to have 64GB whereas 24GB really is going to pinch.
>>
>> Achieving highest performance requires that you match the capabilities of
>> your nodes including CPU, memory, disk and networking.  The standard wisdom
>> is 4-6GB of RAM per core, at least a spindle per core and 1/2 to 2/3 of
>> disk bandwidth available as network bandwidth.
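[Editor's note: the rules of thumb above can be sketched as a quick sizing calculation. The `sizing` helper and the 100 MB/s sustained-throughput-per-SATA-spindle figure are illustrative assumptions, not numbers from the original message.]

```python
# Rough Hadoop worker-node sizing per the rules of thumb in this thread:
# 4-6 GB RAM per core, at least one spindle per core, and network bandwidth
# at 1/2 to 2/3 of aggregate disk bandwidth.
# Assumes ~100 MB/s sustained per SATA disk (an estimate, not from the post).

DISK_MBPS = 100  # assumed sustained MB/s per SATA spindle

def sizing(cores):
    """Return (ram_gb_range, spindles, net_mbps_range) for a node with `cores` cores."""
    ram = (4 * cores, 6 * cores)            # 4-6 GB RAM per core
    spindles = cores                        # at least one spindle per core
    disk_bw = spindles * DISK_MBPS          # aggregate disk bandwidth, MB/s
    net = (disk_bw // 2, disk_bw * 2 // 3)  # 1/2 to 2/3 of disk bandwidth
    return ram, spindles, net

ram, spindles, net = sizing(8)  # a 2 x quad-core node
print(ram, spindles, net)       # (32, 48) 8 (400, 533)
```

For a dual quad-core node this suggests 32-48 GB of RAM, 8 spindles, and roughly 400-533 MB/s of network bandwidth, which is the arithmetic behind the comparisons below.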
>>
>> If you look at the different configurations mentioned in this thread, you
>> will see different limitations.
>>
>> For instance:
>>
>> 2 x Quad cores Intel
>>> 2-3 TB x 6 SATA         <==== 6 disks < desired 8 or more
>>> 64GB mem                <==== slightly larger than necessary
>>> 2 1GbE NICs, teamed     <==== 2 x 100 MB/s = 200 MB/s << 400 MB/s = 2/3 x 6 x 100 MB/s
>>
>>
>> This configuration is mostly limited by network bandwidth.
>>
>> 2 x Quad cores Intel
>>> 2-3 TB x 6 SATA         <==== 6 disks < desired 8 or more
>>> 24GB mem                <==== 24GB << 8 x 6GB = 48GB
>>> 2 10GbE NICs, teamed    <==== 2 x 1000 MB/s = 2000 MB/s >> 400 MB/s = 2/3 x 6 x 100 MB/s
>>
>>
>> This configuration is weak on disk relative to CPU and very weak on disk
>> relative to network speed.  The worst problem, however, is likely to be
>> the small memory.  It will likely force us to cut the number of task
>> slots by half or more, making it impossible to keep even the 6 disks we
>> have busy and leaving the network even more outrageously over-provisioned.
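[Editor's note: the comparison above boils down to checking each configuration's NIC bandwidth against the 2/3-of-aggregate-disk-bandwidth target. The sketch below assumes ~100 MB/s per SATA spindle and ~100 MB/s usable per 1 GbE link (round-number estimates, not figures from the original post).]

```python
# Check NIC bandwidth against the 2/3-of-disk-bandwidth target for the two
# configurations discussed in this thread. Per-disk and per-NIC throughput
# are assumed round numbers, not measurements.

DISK_MBPS = 100  # assumed sustained MB/s per SATA spindle

def network_target(disks):
    """Target network bandwidth: 2/3 of aggregate disk bandwidth (MB/s)."""
    return disks * DISK_MBPS * 2 // 3

target = network_target(6)  # 6 disks -> 400 MB/s target

# Config 1: 2 x 1 GbE (~100 MB/s each) teamed -> 200 MB/s, network-limited
print(target, 2 * 100)

# Config 2: 2 x 10 GbE (~1000 MB/s each) teamed -> 2000 MB/s, over-provisioned
print(target, 2 * 1000)
```

The first configuration falls short of the 400 MB/s target (network-limited); the second overshoots it by 5x, which is why Ted calls its network over-provisioned relative to its disks.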
>>
>>
>>
>>
>> On Sun, May 5, 2013 at 9:41 AM, Rahul Bhattacharjee <
>> [EMAIL PROTECTED]> wrote:
>>
>>> IMHO, 64 GB looks a bit high for a DN; 24 GB should be good enough.
>>>
>>>
>>> On Tue, Apr 30, 2013 at 12:19 AM, Patai Sangbutsarakum <
>>> [EMAIL PROTECTED]> wrote:
>>>
>>>>  2 x Quad cores Intel
>>>> 2-3 TB x 6 SATA
>>>> 64GB mem
>>>> 2 NICs teaming
>>>>
>>>>  my 2 cents
>>>>
>>>>
>>>>  On Apr 29, 2013, at 9:24 AM, Raj Hadoop <[EMAIL PROTECTED]>
>>>>  wrote:
>>>>
>>>> Hi,
>>>>
>>>> I have to propose hardware requirements in my company for a proof of
>>>> concept with Hadoop. I have been reading Hadoop Operations and the
>>>> Cloudera website, but I wanted to ask the group: what are the
>>>> requirements if I have to plan for a 5-node cluster? I don't yet know
>>>> how much data needs to be processed for the proof of concept, so can
>>>> you suggest something?
>>>>
>>>> Regards,
>>>> Raj
>>>>
>>>>
>>>>
>>>
>>
>