Re: Hadoop harddrive space usage
Perfect, thanks. It's what I was looking for.

I have a few nodes, all with 2TB drives, except one with 2x1TB. Which means
that in the end, for Hadoop, it's almost the same thing.
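
As a rough illustration of Robert's example quoted below (3 datanodes with 10GB each, one of them split into two directories), here is a toy Python sketch. It is not Hadoop's placement code: the `Node` class, the `place_blocks` helper, the "most free space first" rule and the 64MB block size are all assumptions made up for this illustration, and real HDFS placement also takes racks and replication into account.

```python
# Toy model only, NOT Hadoop's actual block placement policy.
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    dir_free_gb: list   # free space per configured data directory, in GB
    blocks: int = 0

    @property
    def free_gb(self):
        # A node is judged by its total free space, not its directory count.
        return sum(self.dir_free_gb)


def place_blocks(nodes, num_blocks, block_gb=0.064):
    """Assign each block to the node with the most remaining free space.

    block_gb=0.064 (64MB) is an assumed block size for this sketch.
    """
    used = {n.name: 0.0 for n in nodes}
    for _ in range(num_blocks):
        target = max(nodes, key=lambda n: n.free_gb - used[n.name])
        target.blocks += 1
        used[target.name] += block_gb
    return {n.name: n.blocks for n in nodes}


nodes = [
    Node("node1", [10]),
    Node("node2", [10]),
    Node("node3", [5, 5]),   # same 10GB, split over two directories
]
placement = place_blocks(nodes, num_blocks=3)
print(placement)   # {'node1': 1, 'node2': 1, 'node3': 1}
```

In this model the node with two directories gets neither twice the data nor half of it; only the total free space per node matters.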

JM

2012/12/28, Robert Molina <[EMAIL PROTECTED]>:
> Hi Jean,
> Hadoop does not factor in the number of disks or directories, but mainly
> the free space allocated on each node.  Hadoop will do its best to spread
> the data evenly across the nodes.  For instance, let's say you had 3
> datanodes (replication factor 1), each with 10GB allocated, but one of the
> nodes splits its 10GB across two directories.  Now if we try to store a file
> that takes up 3 blocks, Hadoop will just place 1 block on each node.
>
> Hope that helps.
>
> Regards,
> Robert
>
> On Fri, Dec 28, 2012 at 9:12 AM, Jean-Marc Spaggiari <
> [EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> Quick question regarding hard drive space usage.
>>
>> Hadoop will distribute the data evenly on the cluster. So all the
>> nodes are going to receive almost the same quantity of data to store.
>>
>> Now, if on one node I have 2 directories configured, is Hadoop going
>> to assign twice the quantity to this node? Or is each directory going
>> to receive half the load?
>>
>> Thanks,
>>
>> JM
>>
>