Re: Hadoop harddrive space usage
Hi Jean,
Hadoop does not factor in the number of disks or directories; what matters
is mainly the free space allocated on each node.  Hadoop will do its best to
spread the data evenly across the nodes.  For instance, let's say you had 3
datanodes (replication factor 1), each with 10GB allocated, but one of the
nodes splits its 10GB across two directories.  If we then store a file that
takes up 3 blocks, Hadoop will just place 1 block on each node, so the node
with two directories does not receive twice the data.
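
To make the example concrete, here is a rough toy model in Python. It is
only a sketch of the idea above, not Hadoop's actual block placement code:
the DataNode class, place_blocks function, the 64MB block size, and the
"pick the node with the most free space" rule are all assumptions made for
illustration.

```python
from dataclasses import dataclass

GB = 1024 ** 3
BLOCK_SIZE = 64 * 1024 ** 2  # assumed block size, for illustration only


@dataclass
class DataNode:
    """Hypothetical stand-in for a datanode; not a Hadoop class."""
    name: str
    dir_free: list  # free bytes in each configured data directory
    blocks: int = 0

    @property
    def free(self):
        # In this model only the node's total free space matters,
        # not how many directories that space is split across.
        return sum(self.dir_free)

    def store_block(self, size):
        # Within the node, use the directory with the most free space.
        i = max(range(len(self.dir_free)), key=lambda k: self.dir_free[k])
        self.dir_free[i] -= size
        self.blocks += 1


def place_blocks(nodes, num_blocks, size=BLOCK_SIZE):
    """Place each block on the node with the most total free space."""
    for _ in range(num_blocks):
        target = max(nodes, key=lambda n: n.free)
        target.store_block(size)
    return {n.name: n.blocks for n in nodes}


def make_cluster():
    # Three nodes with 10GB each; node3 splits it into two directories.
    return [
        DataNode("node1", [10 * GB]),
        DataNode("node2", [10 * GB]),
        DataNode("node3", [5 * GB, 5 * GB]),
    ]


if __name__ == "__main__":
    print(place_blocks(make_cluster(), 3))
```

In this model node3 ends up with the same number of blocks as the other
two nodes, because its two directories add up to the same 10GB of free
space.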

Hope that helps.

Regards,
Robert

On Fri, Dec 28, 2012 at 9:12 AM, Jean-Marc Spaggiari <
[EMAIL PROTECTED]> wrote:

> Hi,
>
> Quick question regarding hard drive space usage.
>
> Hadoop will distribute the data evenly on the cluster. So all the
> nodes are going to receive almost the same quantity of data to store.
>
> Now, if on one node I have 2 directories configured, is hadoop going
> to assign twice the quantity on this node? Or is each directory going
> to receive half the load?
>
> Thanks,
>
> JM
>