HDFS >> mail # user >> Sizing help


Re: Sizing help
Depending on which distribution you run and what your data center power limits are,
you may save a lot of money by going with machines that have 12 x 2 or 3 TB
drives. With suitable engineering margins and 3x replication you can have
5 TB of net data per node and 20 nodes per rack. If you want to go all cowboy
with 2x replication and little space to spare, you can double that
density.
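The per-node density above is simple arithmetic, sketched below. The ~37.5% "engineering margin" is my assumption, chosen so that 12 x 2 TB drives with 3x replication reproduces the ~5 TB net figure quoted in the post; it is not a number stated in the thread or in HDFS documentation.

```python
def net_tb_per_node(drives, drive_tb, replication, margin=0.375):
    """Usable capacity per datanode after replication and free-space margin.

    NOTE: margin=0.375 is an assumed value chosen to land on the ~5 TB
    net-per-node figure from the post, not an HDFS-documented number.
    """
    raw_tb = drives * drive_tb
    return raw_tb * (1 - margin) / replication

# 12 x 2 TB drives, 3x replication, ~37.5% margin:
print(net_tb_per_node(12, 2, 3))               # 5.0 TB net per node

# "All cowboy": 2x replication with no space to spare more than doubles it:
print(net_tb_per_node(12, 2, 2, margin=0.0))   # 12.0 TB net per node
```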

On Monday, November 7, 2011, Rita <[EMAIL PROTECTED]> wrote:
> For a 1 PB installation you would need close to 170 servers with a 12 TB disk pack installed on each (with a replication factor of 2). That's a conservative estimate.
> CPUs: 4 cores with 16 GB of memory.
>
> Namenode: 4 cores with 32 GB of memory should be OK.
>
>
> On Fri, Oct 21, 2011 at 5:40 PM, Steve Ed <[EMAIL PROTECTED]> wrote:
>>
>> I am a newbie to Hadoop and trying to understand how to size a Hadoop cluster.
>>
>>
>>
>> What factors should I consider when deciding the number of datanodes?
>>
>> Datanode configuration? CPU, memory?
>>
>> How much memory is required for the namenode?
>>
>>
>>
>> My client is looking at 1 PB of usable data and will be running
analytics on TB-size files using MapReduce.
>>
>>
>>
>>
>>
>> Thanks
>>
>> ….. Steve
>>
>>
>
>
> --
> --- Get your facts first, then you can distort them as you please.--
>
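Rita's "close to 170 servers" estimate above can be checked with the same kind of arithmetic. A hedged sketch, assuming 1 PB ≈ 1024 TB and that each server contributes its full 12 TB raw disk pack with no free-space margin (consistent with her calling it a raw, conservative estimate):

```python
import math

def servers_needed(usable_tb, replication, raw_tb_per_server):
    """Servers needed to hold usable_tb of data at the given replication
    factor, assuming each server's full raw disk pack is available."""
    raw_needed_tb = usable_tb * replication
    return math.ceil(raw_needed_tb / raw_tb_per_server)

# 1 PB usable (~1024 TB), replication factor 2, 12 TB disk pack per server:
print(servers_needed(1024, 2, 12))   # 171, i.e. "close to 170 servers"
```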