
Hadoop, mail # user - decommissioning node woes


decommissioning node woes
Rita 2011-03-17, 01:36
Hello,

I have been struggling with decommissioning data nodes. I have a 50+ data
node cluster (no MR) with each server holding about 2TB of storage. I split
the nodes into 2 racks.
I edit the 'exclude' file and then do a -refreshNodes. I see the node
immediately under 'Decommissioned nodes', but I also see it as a 'live' node!
Even though I wait 24+ hours, it's still like this. I suspect it's a bug
in my version. The data node process is still running on the node I am
trying to decommission, so sometimes I kill -9 the process and then I see
'under replicated' blocks... this can't be the normal procedure.
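For reference, here is roughly the procedure I'm following (the paths and
hostname are from my setup, so treat them as illustrative):

```shell
# The NameNode is pointed at an exclude file via hdfs-site.xml, e.g.:
#   <property>
#     <name>dfs.hosts.exclude</name>
#     <value>/etc/hadoop/conf/excludes</value>
#   </property>

# Add the host to decommission to the exclude file
echo "datanode17.example.com" >> /etc/hadoop/conf/excludes

# Tell the NameNode to re-read its include/exclude lists
hadoop dfsadmin -refreshNodes

# Watch progress; the node should show "Decommission in progress"
# until its blocks are re-replicated elsewhere
hadoop dfsadmin -report
```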

There were even times when I ended up with corrupt blocks because I got
impatient, even after waiting 24-34 hours.
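After killing the process I check block health like this (just a sanity
check I run, not an official recovery procedure):

```shell
# Report filesystem health: lists under-replicated,
# mis-replicated, missing, and corrupt blocks
hadoop fsck /

# More detail per file if something looks wrong
hadoop fsck / -files -blocks -locations
```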

I am using the 0.21.0 release (23 August, 2010):
<http://hadoop.apache.org/hdfs/releases.html#23+August%2C+2010%3A+release+0.21.0+available>

Is this a known bug? Is there anything else I need to do to decommission a
node?

--
--- Get your facts first, then you can distort them as you please.--