Hadoop will not factor in the number of disks or directories, but rather the
allocated free space. Hadoop does its best to spread the data evenly
across the nodes. For instance, say you have 3 datanodes
(replication factor 1), each with 10GB allocated, but one of the
nodes splits its 10GB across two directories. If we then store a file
that takes up 3 blocks, Hadoop will still place 1 block on each node.
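To make the two-directory case concrete, a datanode with its storage split across two directories would be configured roughly like this (a sketch, assuming the dfs.datanode.data.dir property and the hypothetical paths /data/1 and /data/2; on older 1.x releases the property is dfs.data.dir):

```xml
<!-- hdfs-site.xml on the datanode: two storage directories,
     hypothetical paths for illustration -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/data/1/dfs/dn,/data/2/dfs/dn</value>
</property>
```

The datanode writes blocks to these directories in turn, but the namenode still sees the node as one unit of storage, so the node as a whole does not attract twice the data.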
Hope that helps.
On Fri, Dec 28, 2012 at 9:12 AM, Jean-Marc Spaggiari <
[EMAIL PROTECTED]> wrote:
> Quick question regarding hard drive space usage.
> Hadoop will distribute the data evenly on the cluster. So all the
> nodes are going to receive almost the same quantity of data to store.
> Now, if on one node I have 2 directories configured, is Hadoop going
> to assign twice the quantity of data to this node? Or is each directory
> going to receive half the load?