Thank you for such detailed information.
By the way, what type of disk controller do you use?
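For anyone sizing something similar, here is a rough back-of-the-envelope sketch of the disk capacity in the configuration quoted below. The node counts, disk sizes and daily ingest come from the quoted message. The replication factor of 3 (the HDFS default) and the 25% of disk set aside for intermediate data are my assumptions, not something stated in this thread, and compression is ignored.

```python
# Rough HDFS capacity estimate for the cluster quoted in this thread.
# Hardware numbers are taken from the quoted message; REPLICATION and
# RESERVED_FRACTION are assumptions made only for illustration.

TB_PER_DISK = 2
DISKS_PER_NODE = 12
DATA_NODES = 38
DAILY_INGEST_TB = 1.5

REPLICATION = 3           # assumed: HDFS default replication factor
RESERVED_FRACTION = 0.25  # assumed: share of disk kept for temp/shuffle data


def usable_tb(nodes, disks_per_node, tb_per_disk,
              replication=REPLICATION, reserved=RESERVED_FRACTION):
    """Return the approximate TB of unique data the nodes can hold."""
    raw = nodes * disks_per_node * tb_per_disk
    return raw * (1 - reserved) / replication


cluster_raw_tb = DATA_NODES * DISKS_PER_NODE * TB_PER_DISK
cluster_usable_tb = usable_tb(DATA_NODES, DISKS_PER_NODE, TB_PER_DISK)
days_of_ingest = cluster_usable_tb / DAILY_INGEST_TB

# The suggested single machine with 10 x 2 TB disks: a lone DataNode can
# hold only one replica of each block, so replication is effectively 1.
single_box_usable_tb = usable_tb(1, 10, 2, replication=1)

print(f"cluster raw:       {cluster_raw_tb} TB")
print(f"cluster usable:    {cluster_usable_tb:.0f} TB")
print(f"days of ingest:    {days_of_ingest:.0f}")
print(f"single box usable: {single_box_usable_tb:.0f} TB")
```

Under these assumptions the single 10-disk machine has room for the 5-6 TB mentioned in the original question, though with no redundancy at all.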
On Tue, Oct 2, 2012 at 6:34 AM, Alexander Pivovarov <[EMAIL PROTECTED]> wrote:
> Hi Oleg,
> Cloudera and Dell set up the following cluster for my company.
> The company receives 1.5 TB of raw data per day.
> 38 data nodes + 2 Name Nodes
> Data Node:
> Dell PowerEdge C2100 series
> 2 x Xeon X5670
> 48 GB RAM ECC (12x4GB 1333MHz)
> 12 x 2 TB 7200 RPM SATA HDD (with hot swap) JBOD
> Intel Gigabit ET Dual port PCIe x4
> Redundant Power Supply
> Hadoop CDH3
> max map tasks 24
> max reduce tasks 8
> The Name Node and Secondary Name Node are similar, but with:
> 96 GB RAM (not sure why)
> 6 x 600 GB 15K RPM Serial Attached SCSI (SAS)
> another config is here
> page 298
> You probably need just 1 computer with 10 x 2 TB SATA HDDs.
> On Mon, Oct 1, 2012 at 6:02 PM, Oleg Ruchovets <[EMAIL PROTECTED]> wrote:
> > Hi,
> > We are at a very early stage of our Hadoop project and want to do a
> > We have ~5-6 terabytes of raw data and we are going to execute some
> > aggregations.
> > We plan to use 8-10 machines.
> > Questions:
> > 1) Which hardware should we use:
> > a) How many disks, and which disks are better to use?
> > b) How much RAM?
> > c) How many CPUs?
> > 2) Please share best practices and tips/tricks for utilising
> > hardware in Hadoop projects.
> > Thanks in advance
> > Oleg.