MapReduce >> mail # user >> Diskspace usage


Re: Diskspace usage
Hi,

HDFS by default writes to disks in round-robin fashion, which means
your 256 GB disk will indeed fill up faster than the rest over time.
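
To make the arithmetic concrete, here is a rough sketch in Python (not
HDFS code). It assumes equal-sized blocks handed out strictly in turn,
and hypothetical capacities of one 256 GB disk next to two 2 TB disks;
the function name and the 600 GB figure are made up for illustration.

```python
def usage_after_round_robin(capacities_gb, total_written_gb):
    """Spread total_written_gb evenly across the disks (ignoring capacity
    limits) and return the fraction of each disk that would be in use."""
    per_disk_gb = total_written_gb / len(capacities_gb)
    return [per_disk_gb / cap for cap in capacities_gb]


# Hypothetical layout: one small disk and two 2 TB disks.
capacities = [256, 2048, 2048]
fractions = usage_after_round_robin(capacities, total_written_gb=600)

for cap, frac in zip(capacities, fractions):
    print(f"{cap} GB disk: {frac:.0%} used")
```

Every disk receives the same number of gigabytes, so the smallest one
reaches a high usage percentage long before the others do.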

When a disk is full, it is skipped for writes until some block data is
deleted from it (as part of regular HDFS file deletes). However, blocks
already on the disk are still served for reads as normal. The disk
isn't marked as failed or ejected; it is simply not written to (i.e.
not selected in the round-robin of block writes) until it is usable
again.
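
The behaviour described above can be sketched roughly as follows. This
is a toy Python model, not HDFS's actual implementation: the `Volume`
and `RoundRobinChooser` names, the 128 MB block size, and the tiny
capacities are all assumptions chosen to keep the example short.

```python
from dataclasses import dataclass

BLOCK_MB = 128  # assumed block size, for illustration only


@dataclass
class Volume:
    name: str
    capacity_mb: int
    used_mb: int = 0

    def has_room(self, size_mb):
        return self.capacity_mb - self.used_mb >= size_mb


class RoundRobinChooser:
    """Hands out volumes in turn, skipping any that cannot fit the block."""

    def __init__(self, volumes):
        self.volumes = volumes
        self.next_index = 0

    def choose(self, size_mb):
        # Try each volume at most once, starting where the last call stopped.
        for _ in range(len(self.volumes)):
            vol = self.volumes[self.next_index]
            self.next_index = (self.next_index + 1) % len(self.volumes)
            if vol.has_room(size_mb):
                return vol
        raise IOError("no volume has enough free space")


small = Volume("small", capacity_mb=256)
big = Volume("big", capacity_mb=2048)
chooser = RoundRobinChooser([small, big])

# Write six blocks; the small volume fills up after taking two of them.
placements = []
for _ in range(6):
    vol = chooser.choose(BLOCK_MB)
    vol.used_mb += BLOCK_MB
    placements.append(vol.name)

# Deleting a block from the small volume makes it eligible for writes again.
small.used_mb -= BLOCK_MB
reused = chooser.choose(BLOCK_MB).name
```

In this model a full volume is never removed from the list; it is only
passed over, which is why it becomes writable again as soon as space is
freed.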

So tl;dr: This is a no-worry situation.

However, is this 256 GB disk a root disk (i.e. the OS disk)? We usually
recommend not using the OS disk for DN data storage, as there's a
chance it fills up if misconfigured, which could lead to weird OS
issues showing up.

On Fri, Nov 23, 2012 at 1:41 AM, Jean-Marc Spaggiari
<[EMAIL PROTECTED]> wrote:
> Hi,
>
> Quick question on the way hadoop is using the disk space.
>
> Let's say I have 8 nodes: 7 of them with a 2 TB disk, and one with a 256 GB disk.
>
> Is hadoop going to use the 256GB until it's full, then continue with
> the other nodes only but keeping the 256GB live? Or will it bring the
> 256GB node down when it is full (like for failures) and continue
> with the 7 remaining nodes?
>
> To summarize, is hadoop taking care of the drive size?
>
> Thanks,
>
> JM

--
Harsh J