Re: Sane max storage size for DN
Mohammad Tariq 2012-12-13, 15:18
Thank you so much Hemanth.
Regards,
Mohammad Tariq

On Thu, Dec 13, 2012 at 8:21 PM, Hemanth Yamijala <[EMAIL PROTECTED]> wrote:

> This is a dated blog post, so it would help if someone with current HDFS
> knowledge can validate it:
> http://developer.yahoo.com/blogs/hadoop/posts/2010/05/scalability_of_the_hadoop_dist/
>
> There is a bit about the RAM required for the Namenode and how to compute
> it. You can look at the 'Namespace limitations' section.
>
> Thanks
> hemanth
>
> On Thu, Dec 13, 2012 at 10:57 AM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
>
>> Hello Chris,
>>
>> Thank you so much for the valuable insights. I was actually using
>> the same principle. I made the blunder of doing the maths for the
>> entire (9*3) PB.
>>
>> Seems I am higher than you, that too without drinking ;)
>>
>> Many thanks.
>>
>> Regards,
>> Mohammad Tariq
>>
>> On Thu, Dec 13, 2012 at 10:38 AM, Chris Embree <[EMAIL PROTECTED]> wrote:
>>
>>> Hi Mohammed,
>>>
>>> The amount of RAM on the NN is related to the number of blocks... so
>>> let's do some math. :) 1G of RAM to 1M blocks seems to be the general rule.
>>>
>>> I'll probably mess this up so someone check my math:
>>>
>>> 9 PB ~ 9,216 TB ~ 9,437,184 GB of data. Let's put that in 128MB blocks:
>>> according to kcalc that's 75,497,472 blocks of 128 MB.
>>> Unless I missed this by an order of magnitude (entirely possible... I've
>>> been drinking since 6), that sounds like 76G of RAM (above OS requirements).
>>> 128G should kick its ass; 256G seems like a waste of $$.
>>>
>>> Hmm... That makes the NN sound extremely efficient. Someone validate me
>>> or kick me to the curb.
>>>
>>> YMMV ;)
>>>
>>> On Wed, Dec 12, 2012 at 10:52 PM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hello Michael,
>>>>
>>>> It's an array. The actual size of the data could be somewhere
>>>> around 9PB (exclusive of replication) and we want to keep the number of
>>>> DNs as low as possible. Computations are not too frequent, as I have
>>>> specified earlier.
>>>> If I have 500TB in 1 DN, the number of DNs would be around 49. And, if
>>>> the block size is 128MB, the number of blocks would be 201326592. So, I
>>>> was thinking of having 256GB RAM for the NN. Does this make sense to you?
>>>>
>>>> Many thanks.
>>>>
>>>> Regards,
>>>> Mohammad Tariq
>>>>
>>>> On Thu, Dec 13, 2012 at 12:28 AM, Michael Segel <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> 500 TB?
>>>>>
>>>>> How many nodes in the cluster? Is this attached storage or is it in an
>>>>> array?
>>>>>
>>>>> I mean if you have 4 nodes for a total of 2PB, what happens when you
>>>>> lose 1 node?
>>>>>
>>>>> On Dec 12, 2012, at 9:02 AM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>> Hello list,
>>>>>
>>>>> I don't know if this question makes any sense, but I would
>>>>> like to ask, does it make sense to store 500TB (or more) data in a single
>>>>> DN? If yes, then what should be the spec of other parameters *viz*. NN
>>>>> & DN RAM, N/W etc? If no, what could be the alternative?
>>>>>
>>>>> Many thanks.
>>>>>
>>>>> Regards,
>>>>> Mohammad Tariq
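
Chris's back-of-the-envelope estimate in the thread can be reproduced with a short Python sketch. It assumes binary units (1 PB = 1024^5 bytes), files that fill whole 128 MB blocks, and the thread's rule of thumb of roughly 1 GB of NameNode RAM per million blocks. The function and constant names are illustrative only and are not part of any Hadoop API.

```python
# Rule of thumb quoted in the thread: ~1 GB of NameNode RAM per 1M blocks.
# This is an approximation from the discussion, not a Hadoop-defined constant.
BLOCKS_PER_GB_RAM = 1_000_000

MB = 1024 ** 2
PB = 1024 ** 5


def block_count(data_bytes: int, block_size_bytes: int) -> int:
    """Minimum number of blocks needed to hold data_bytes (ceiling division)."""
    return -(-data_bytes // block_size_bytes)


def namenode_ram_gb(blocks: int) -> float:
    """Estimated NameNode RAM in GB under the 1 GB per 1M blocks rule of thumb."""
    return blocks / BLOCKS_PER_GB_RAM


# 9 PB of data before replication, stored in 128 MB blocks.
blocks = block_count(9 * PB, 128 * MB)
print(blocks)                              # 75497472
print(round(namenode_ram_gb(blocks), 1))   # 75.5

# The same calculation on the replicated size (9*3 PB), for comparison.
replicated_blocks = block_count(27 * PB, 128 * MB)
print(round(namenode_ram_gb(replicated_blocks), 1))  # 226.5
```

The first result matches the 75,497,472 blocks and roughly 76G of RAM worked out in the thread. The second shows how much the estimate grows if replicas are counted as separate blocks. Many small files would push the real block count above these minimums, so the numbers are a lower bound under the stated assumptions.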