Re: Hardware Selection for Hadoop
Hi Raj,

Knowing the data that you want to process with Hadoop is critical here, at
least an approximation of it. I think that Hadoop Operations is an
invaluable resource for this:

- Hadoop uses RAM heavily, so the first resource to consider is giving the
nodes as much RAM as you can, with a particular focus on the
NameNode/JobTracker node.

- For the DataNode/TaskTracker nodes, it is very good to have fast disks.
SSDs are ideal but expensive, so weigh that cost; among spinning disks, the
Seagate Barracuda drives have been awesome for me.

- A good network connection between the nodes. Hadoop is an RPC-based
platform, so a good network is critical for a healthy cluster.

A good starting point for a small cluster is:

- NN/JT: 8 to 16 GB RAM
- DN/TT: 4 to 8 GB RAM
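
Those sizes are set through Hadoop's environment file. A minimal sketch,
assuming a Hadoop 1.x-style hadoop-env.sh; the -Xmx values are illustrative
examples, not tuned recommendations:

```shell
# hadoop-env.sh (Hadoop 1.x layout) -- illustrative values only

# Default heap for all Hadoop daemons, in MB (applies to DN/TT workers)
export HADOOP_HEAPSIZE=4096

# Give the master daemons (NameNode/JobTracker) a larger heap than the workers
export HADOOP_NAMENODE_OPTS="-Xmx8g $HADOOP_NAMENODE_OPTS"
export HADOOP_JOBTRACKER_OPTS="-Xmx8g $HADOOP_JOBTRACKER_OPTS"
```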

Always consider using compression to optimize the communication between
all the services in your Hadoop cluster (Snappy is my favorite).
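
For example, compressing intermediate map output with Snappy can be enabled
in mapred-site.xml. The property names below are the Hadoop 1.x / MRv1 ones,
matching the JobTracker setup above; in MRv2 they are named differently:

```xml
<!-- mapred-site.xml: compress intermediate map output with Snappy -->
<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
</property>
<property>
  <name>mapred.map.output.compression.codec</name>
  <value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
```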

All of this advice is in the Hadoop Operations book by Eric Sammer, so it's
a must-read for every Hadoop systems engineer.

2013/4/29 Raj Hadoop <[EMAIL PROTECTED]>

>    Hi,
>
> I have to propose some hardware requirements in my company for a Proof of
> Concept with Hadoop. I was reading Hadoop Operations and also saw the
> Cloudera website. But I just wanted to know from the group - what are the
> requirements if I have to plan for a 5 node cluster? I don't know at this
> time the data that needs to be processed for the Proof of Concept. So - can
> you suggest something to me?
>
> Regards,
> Raj
>

--
Marcos Ortiz Valmaseda,
*Data-Driven Product Manager* at PDVSA
*Blog*: http://dataddict.wordpress.com/
*LinkedIn: *http://www.linkedin.com/in/marcosluis2186
*Twitter*: @marcosluis2186 <http://twitter.com/marcosluis2186>