Perfect, thanks. It's what I was looking for.
I have a few nodes, all with 2TB drives, but one with 2x1TB. Which means
that in the end, for Hadoop, it's almost the same thing.
2012/12/28, Robert Molina <[EMAIL PROTECTED]>:
> Hi Jean,
> Hadoop will not factor in the number of disks or directories, but rather mainly
> the allocated free space. Hadoop will do its best to spread the data
> evenly across the nodes. For instance, let's say you had 3 datanodes
> (replication factor 1) and each has allocated 10GB, but one of the
> nodes splits its 10GB into two directories. Now if we try to store a file
> that takes up 3 blocks, Hadoop will just place 1 block on each node.
> Hope that helps.
> On Fri, Dec 28, 2012 at 9:12 AM, Jean-Marc Spaggiari <
> [EMAIL PROTECTED]> wrote:
>> Quick question regarding hard drive space usage.
>> Hadoop will distribute the data evenly on the cluster, so all the
>> nodes are going to receive almost the same quantity of data to store.
>> Now, if on one node I have 2 directories configured, is Hadoop going
>> to assign twice the quantity to this node? Or is each directory going
>> to receive half the load?
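
For reference, "2 directories configured" refers to listing multiple local paths in the datanode's storage property. A minimal sketch of an hdfs-site.xml fragment (the mount paths /data/disk1 and /data/disk2 are hypothetical examples, not from the thread above):

```xml
<!-- hdfs-site.xml: two local directories backing one datanode.
     The datanode round-robins new blocks across these directories,
     but the namenode still sees the node as a single datanode whose
     capacity is the combined free space of both paths. -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/data/disk1/dfs/data,/data/disk2/dfs/data</value>
</property>
```

So a node with 2x1TB disks listed this way advertises roughly the same capacity as a node with one 2TB disk, which matches the observation that for Hadoop it's almost the same thing.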