MapReduce, mail # user - Re: Sane max storage size for DN


Re: Sane max storage size for DN
Mohammad Tariq 2012-12-13, 05:27
Hello Chris,

     Thank you so much for the valuable insights. I was actually using the
same principle, but I made the blunder of doing the math for the entire
(9*3) PB instead of just the 9 PB of unique data.

Seems my estimate came out higher than yours, and that without any drinking ;)

Many thanks.
Regards,
    Mohammad Tariq
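
A minimal sketch of the corrected arithmetic, assuming the ~1 GB of NN heap
per 1 million blocks rule of thumb quoted below. The constants come from this
thread; the rule itself is only an approximation:

    # Back-of-the-envelope NameNode heap sizing (illustrative only).
    # Key point: the NN tracks each unique block once, so replication
    # does not multiply the block count -- size against 9 PB, not 9*3 PB.

    DATA_PB = 9                    # unique data, exclusive of replication
    BLOCK_SIZE_MB = 128            # HDFS block size used in this thread
    GB_PER_MILLION_BLOCKS = 1.0    # rule of thumb, approximate

    data_mb = DATA_PB * 1024 ** 3        # PB -> MB
    blocks = data_mb / BLOCK_SIZE_MB     # 75,497,472 blocks
    heap_gb = blocks / 1e6 * GB_PER_MILLION_BLOCKS

    print(f"blocks:  {blocks:,.0f}")                        # 75,497,472
    print(f"NN heap: ~{heap_gb:.0f} GB, plus OS overhead")  # ~75 GB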

On Thu, Dec 13, 2012 at 10:38 AM, Chris Embree <[EMAIL PROTECTED]> wrote:

> Hi Mohammad,
>
> The amount of RAM on the NN is related to the number of blocks... so let's
> do some math. :)  1 GB of RAM per 1M blocks seems to be the general rule.
>
> I'll probably mess this up, so someone check my math:
>
> 9 PB ~ 9,216 TB ~ 9,437,184 GB of data.  Let's put that in 128 MB blocks:
> according to kcalc, that's 75,497,472 blocks of 128 MB each.
> Unless I missed this by an order of magnitude (entirely possible... I've
> been drinking since 6), that sounds like 76 GB of RAM (above OS
> requirements).  128 GB should kick its ass; 256 GB seems like a waste of $$.
>
> Hmm... That makes the NN sound extremely efficient.  Someone validate me
> or kick me to the curb.
>
> YMMV ;)
>
>
> On Wed, Dec 12, 2012 at 10:52 PM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
>
>> Hello Michael,
>>
>>       It's an array. The actual size of the data could be somewhere
>> around 9 PB (exclusive of replication), and we want to keep the number
>> of DNs as low as possible. Computations are not too frequent, as I
>> mentioned earlier. If I have 500 TB per DN, the number of DNs would be
>> around 49. And, if the block size is 128 MB, the number of blocks would
>> be 201,326,592. So, I was thinking of having 256 GB RAM for the NN. Does
>> this make sense to you?
>>
>> Many thanks.
>>
>> Regards,
>>     Mohammad Tariq
>>
>>
>>
>> On Thu, Dec 13, 2012 at 12:28 AM, Michael Segel <[EMAIL PROTECTED]> wrote:
>>
>>> 500 TB?
>>>
>>> How many nodes in the cluster? Is this attached storage or is it in an
>>> array?
>>>
>>> I mean, if you have 4 nodes for a total of 2 PB, what happens when you
>>> lose 1 node?
>>>
>>>
>>> On Dec 12, 2012, at 9:02 AM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
>>>
>>> Hello list,
>>>
>>>           I don't know if this question makes any sense, but I would
>>> like to ask: does it make sense to store 500 TB (or more) of data in a
>>> single DN? If yes, then what should the specs of the other parameters,
>>> viz. NN & DN RAM, N/W, etc., be? If no, what could be the alternative?
>>>
>>> Many thanks.
>>>
>>> Regards,
>>>     Mohammad Tariq
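
For completeness, a hypothetical sketch of the cluster-sizing arithmetic from
the quoted thread: how many 500 TB DataNodes 9 PB of data needs at 3x
replication, and what losing one node implies, per Michael's question. The
500 TB/DN and 3x figures come from the thread; the headroom factor is an
assumption for illustration:

    # Illustrative DataNode count for 9 PB at 3x replication.
    DATA_TB = 9 * 1024        # 9 PB of unique data, from the thread
    REPLICATION = 3           # implied by the (9*3) PB figure above
    DN_CAPACITY_TB = 500      # storage per DataNode, from the thread
    HEADROOM = 0.75           # assumption: keep disks at most ~75% full

    raw_tb = DATA_TB * REPLICATION              # 27,648 TB on disk
    dns = raw_tb / (DN_CAPACITY_TB * HEADROOM)  # ~74 nodes; ~55 with no headroom
    print(f"raw storage on disk: {raw_tb:,} TB")
    print(f"DataNodes needed:    {dns:.0f}")

    # Losing one DN forces HDFS to re-replicate up to its full capacity
    # across the remaining nodes -- the risk Michael raises above.
    print(f"re-replication after one DN loss: up to {DN_CAPACITY_TB} TB")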