Re: Disk on data node full
Yanbo Liang 2012-03-25, 03:32
I wonder what caused this imbalance in the first place?
2012/3/17 Zizon Qiu <[EMAIL PROTECTED]>
> If there are only DFS files under /data and /data2, it will be OK when a
> disk fills up.
> But if other files live there too, such as the MapReduce temp folder or even
> a NameNode image, a full disk may break the cluster (the NameNode cannot do
> a checkpoint, or the MapReduce framework cannot continue because there is no
> disk space for intermediate files).
> 1) bring down HDFS and just manually move ~50% of the
> /data/dfs/dn/current/subdir* directories over to /data2 and then bring HDFS
> back up
> Moving the files around may work, but I am not sure.
> The DataNode MAY report the updated locations back to the NameNode.
> 2) bring a data node down one at a time, clean out /data and /data2, put
> the node back into rotation and let the balancer distribute replication
> data back onto the node and since it will round robin to both (now empty)
> disks, I will wind up with a nicely balanced data node. Repeat this process
> for the remaining nodes.
> This works fine.
> You may configure *dfs.datanode.du.reserved* to reserve space for non-DFS
> use on each DataNode volume, but take care with the formula Hadoop uses to
> calculate the free disk space.
> On Sat, Mar 17, 2012 at 8:57 PM, Tom Wilberding <[EMAIL PROTECTED]> wrote:
>> Hi there,
>> Our data nodes all have 2 disks, one which is nearly full and one which
>> is nearly empty:
>> $ df -h
>> Filesystem          Size  Used Avail Use% Mounted on
>>                     120G   11G  104G   9% /
>> /dev/cciss/c0d0p1    99M   35M   60M  37% /boot
>> tmpfs               7.9G     0  7.9G   0% /dev/shm
>> /dev/cciss/c0d1     1.8T  1.7T  103G  95% /data
>> /dev/cciss/c0d2     1.8T   76G  1.8T   5% /data2
>> Reading through the docs and mailing list archives, my understanding is
>> that HDFS will continue to round robin to both disks until /data is
>> completely full and then only write to /data2. Is this correct? Does it
>> really write until the disk is 100% full (or as close to full as possible)?
>> Ignoring performance of this situation and the monitoring hassles of
>> having full disks, I just want to be sure that nothing bad is going to
>> happen over the next couple of days as we fill up that /data partition.
>> I understand that my best two options to rebalance each data node would
>> be to either:
>> 1) bring down HDFS and just manually move ~50% of the
>> /data/dfs/dn/current/subdir* directories over to /data2 and then bring HDFS
>> back up
>> 2) bring a data node down one at a time, clean out /data and /data2, put
>> the node back into rotation and let the balancer distribute replication
>> data back onto the node and since it will round robin to both (now empty)
>> disks, I will wind up with a nicely balanced data node. Repeat this process
>> for the remaining nodes.
>> I'm relatively new to HDFS, so can someone please confirm whether what
>> I'm saying is correct? Any tips, tricks or things to watch out for would
>> also be greatly appreciated.
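
For what it's worth, the round-robin behaviour discussed in this thread can be
sketched in a few lines of Python. This is only a simplified model, not
Hadoop's actual code: the Volume and RoundRobinChooser names are made up for
illustration, and "usable = free - reserved" is an assumption standing in for
whatever formula a given Hadoop version really uses with
dfs.datanode.du.reserved. It shows writes alternating between volumes until
one runs out of usable space, after which everything lands on the other.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """Hypothetical stand-in for one DataNode data directory."""
    path: str
    free: int          # bytes currently free on the file system
    reserved: int = 0  # bytes held back for non-DFS use (assumed model)

    def usable(self) -> int:
        # Assumption: space available to new blocks is free minus reserved.
        return max(self.free - self.reserved, 0)


class RoundRobinChooser:
    """Simplified round-robin volume selection (illustrative only)."""

    def __init__(self, volumes):
        self.volumes = list(volumes)
        self.next_index = 0

    def choose(self, block_size: int) -> str:
        # Try each volume at most once, starting where we left off.
        for _ in range(len(self.volumes)):
            vol = self.volumes[self.next_index]
            self.next_index = (self.next_index + 1) % len(self.volumes)
            if vol.usable() >= block_size:
                vol.free -= block_size
                return vol.path
        raise RuntimeError("no volume has room for the block")


def simulate(volumes, block_size: int, blocks: int) -> dict:
    """Write `blocks` blocks and count how many land on each volume."""
    chooser = RoundRobinChooser(volumes)
    counts = {v.path: 0 for v in volumes}
    for _ in range(blocks):
        counts[chooser.choose(block_size)] += 1
    return counts


if __name__ == "__main__":
    GB = 1024 ** 3
    # Free space taken from the df output above; the 10 GB reserve is made up.
    vols = [Volume("/data", 103 * GB, 10 * GB),
            Volume("/data2", 1800 * GB, 10 * GB)]
    print(simulate(vols, 64 * 1024 ** 2, 5000))
```

In this model the nearly full disk simply stops receiving blocks once its
usable space drops below one block, so the reserve is what keeps it from
reaching 100%. Whether a real cluster behaves this way depends on the Hadoop
version and configuration.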