Arun Ramakrishnan 2010-07-08, 23:59
Varene Olivier 2010-07-09, 14:44
Arun Ramakrishnan 2010-07-09, 17:56
Varene Olivier 2010-07-13, 08:32
Arun Ramakrishnan 2010-07-13, 21:46
Re: decommissioning nodes help
Allen Wittenauer 2010-07-13, 22:43
Do you have a topology defined?

On Jul 13, 2010, at 2:46 PM, Arun Ramakrishnan wrote:

> I don't know where the problem was. J-D said somewhere that the decommissioning process is well tested and less likely to have bugs.
>
> Anyway, I just resorted to killing 2 nodes, waiting until fsck reports 100% replication at 3, killing 2 more nodes, and so on.
> Worked fine.
>
> Thanks
> Arun
>
> -----Original Message-----
> From: Varene Olivier [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, July 13, 2010 1:32 AM
> To: [EMAIL PROTECTED]
> Subject: Re: decommissioning nodes help
>
> Are your datanodes double attached to the network?
> If so, you can indeed see your datanodes as double entries.
> You should also check that your DNS resolution matches the
> hostnames of your datanodes.
>
> To solve your issue, you can switch off one datanode at a time (by
> killing the process).
> The master should notice and take action to maintain the
> replication level.
> Do it slowly :) (or you might lose some data)
> You can tell whether the process is over by checking whether the I/O
> from block writing has stopped.
>
> Cheers
>
> Arun Ramakrishnan wrote:
>> That's what I thought.
>>
>> But this is what I see in -report for the excluded nodes.
>>
>> **************
>> Decommission Status : Normal
>> Configured Capacity: 0 (0 KB)
>> DFS Used: 0 (0 KB)
>> Non DFS Used: 0 (0 KB)
>> DFS Remaining: 0(0 KB)
>> DFS Used%: 100%
>> DFS Remaining%: 0%
>> Last contact: Wed Dec 31 16:00:00 PST 1969
>> ***************
>>
>> In the UI, the excluded nodes show up in both live and dead nodes, and it's been several hours now. The block counts across the nodes are exactly the same.
>> The cluster is not accessed by any clients; it's not busy at all.
>>
>> And I have set dfs.balance.bandwidthPerSec = 2000000 in hdfs-site.xml
>>
>> Anyway, I think I am lost here. I am just resorting to killing 2 nodes at a time, a sort of backwards strategy. At least I know it works.
>>
>> Thanks
>> Arun
>>
>> -----Original Message-----
>> From: Varene Olivier [mailto:[EMAIL PROTECTED]]
>> Sent: Friday, July 09, 2010 7:44 AM
>> To: [EMAIL PROTECTED]
>> Subject: Re: decommissioning nodes help
>>
>> Hello,
>>
>> In the web interface at
>>
>> http://yourDatanodeMaster:50070/
>>
>> you should see the status of your node change to Decommissioning.
>> When done, it is removed from the list of active nodes.
>>
>> With a lot of bandwidth available for the sync, the process is very fast,
>> so, to answer your other mail, the process might already be done.
>>
>> You can also check the status of your node via the CLI:
>>
>> # hadoop dfsadmin -report
>>
>> Name : ...
>> Decommission Status : <StatusOfYourNode>
>> ...
>>
>> Hope it helps
>>
>> Arun Ramakrishnan wrote:
>>> Hi guys
>>>
>>> I am stuck in my attempt to remove nodes from hdfs.
>>>
>>> I followed the steps in https://issues.apache.org/jira/browse/HDFS-1125
>>>
>>> a) add node to dfs.hosts.exclude
>>>
>>> b) dfsadmin -refreshNodes
>>>
>>> c) wait for decom to finish
>>>
>>> d) remove node from both dfs.hosts and dfs.hosts.exclude
>>>
>>> But after steps a) and b), how do I know if decommission is complete?
>>>
>>> I am in the process of decommissioning 6 nodes and don't want to lose
>>> any blocks (rep factor is 3) with a restart.
>>>
>>> I also opened https://issues.apache.org/jira/browse/HDFS-1290 if anyone
>>> is interested.
>>>
>>> Thanks
>>>
>>> Arun
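The question of how to tell when step c) is finished could be answered by reading the "Decommission Status" field of `hadoop dfsadmin -report`, as suggested in the thread. What follows is only a minimal sketch in Python, not a tested tool: it assumes the report prints a "Name" line followed by a "Decommission Status" line for each datanode, as quoted above, and that a finished node reports the value "Decommissioned" (an assumption; check the value your Hadoop version prints). The function names are hypothetical.

```python
import re


def parse_decommission_status(report_text):
    """Map each datanode name to its 'Decommission Status' value.

    Assumes each node's block in the report starts with a 'Name' line
    and contains a 'Decommission Status' line, as quoted in the thread.
    """
    statuses = {}
    current = None
    for raw in report_text.splitlines():
        line = raw.strip()
        name = re.match(r"^Name\s*:\s*(\S+)", line)
        if name:
            current = name.group(1)
            continue
        status = re.match(r"^Decommission Status\s*:\s*(.+)$", line)
        if status and current is not None:
            statuses[current] = status.group(1).strip()
    return statuses


def pending_nodes(statuses, excluded, done_status="Decommissioned"):
    """Return the excluded nodes that do not yet report done_status.

    'excluded' must use the same names the report prints (hypothetical
    matching rule; real reports may include a port in the name).
    done_status is an assumed value, not confirmed by the thread.
    """
    return sorted(n for n in excluded if statuses.get(n) != done_status)
```

The report text would come from running `hadoop dfsadmin -report` and capturing its output. An empty list from `pending_nodes` would suggest it is safe to move on to step d), though confirming with fsck, as Arun did, is still a reasonable extra check.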
Varene Olivier 2010-07-15, 14:30