HDFS, mail # user - decommissioning nodes help


Arun Ramakrishnan 2010-07-08, 23:59
Varene Olivier 2010-07-09, 14:44
Arun Ramakrishnan 2010-07-09, 17:56
Varene Olivier 2010-07-13, 08:32
Arun Ramakrishnan 2010-07-13, 21:46
Re: decommissioning nodes help
Allen Wittenauer 2010-07-13, 22:43

Do you have a topology defined?
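[Editor's note: for context, "a topology" here means HDFS rack awareness, usually wired up via the topology.script.file.name property pointing at a script that maps datanode addresses to rack paths. A minimal sketch of such a script's mapping logic — the subnets and rack names are purely illustrative assumptions, not from the thread:]

```shell
# Hedged sketch of a rack-awareness script body. Hadoop invokes the
# configured script with one or more datanode IPs/hostnames and expects
# one rack path printed per argument. The subnet-to-rack mapping below
# is illustrative only.
rack_for() {
  case "$1" in
    10.1.1.*) echo "/dc1/rack1" ;;
    10.1.2.*) echo "/dc1/rack2" ;;
    *)        echo "/default-rack" ;;
  esac
}

for node in "$@"; do
  rack_for "$node"
done
```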

On Jul 13, 2010, at 2:46 PM, Arun Ramakrishnan wrote:

> I don't know where the problem was. J-D said somewhere that the decommissioning process is well tested and less likely to have bugs.
>
> Anyways, I just resorted to killing 2 nodes, waiting till fsck reported 100% replication at factor 3, killing 2 more nodes ... and so on.
> Worked fine.
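[Editor's note: the kill-and-wait loop above hinges on reading the fsck summary. A minimal sketch of the check, using a canned sample in the stock "hadoop fsck /" summary format so the parsing stands alone — on a live cluster the report would come from running the command itself:]

```shell
# Hedged sketch: decide whether replication has recovered before
# killing the next pair of nodes. On a live cluster you would run:
#   report=$(hadoop fsck / 2>/dev/null)
# A canned sample (stock summary format) keeps this standalone.
report='Status: HEALTHY
 Total blocks (validated):      12345
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)'

under=$(echo "$report" | awk '/Under-replicated blocks/ {print $3}')
if [ "$under" -eq 0 ]; then
  echo "replication restored; safe to take down the next nodes"
fi
```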
>
> Thanks
> Arun
>
> -----Original Message-----
> From: Varene Olivier [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, July 13, 2010 1:32 AM
> To: [EMAIL PROTECTED]
> Subject: Re: decommissioning nodes help
>
> Are your datanodes dual-attached to the network?
> If so, you can indeed see your datanodes as duplicate entries.
> You should also check that your DNS resolution matches the
> hostnames of your datanodes.
>
>
> To solve your issue, you can switch off one datanode at a time (by
> killing the process).
> The master should notice that and take action to maintain the
> replication level.
> Do it slowly :) (or you might lose some data)
> You can tell whether the process is over by checking whether the I/O
> from block re-replication has finished.
>
> Cheers
>
>
> Arun Ramakrishnan wrote:
>> That's what I thought.
>>
>> But this is what I see in -report for the excluded nodes.
>>
>> **************
>> Decommission Status : Normal
>> Configured Capacity: 0 (0 KB)
>> DFS Used: 0 (0 KB)
>> Non DFS Used: 0 (0 KB)
>> DFS Remaining: 0(0 KB)
>> DFS Used%: 100%
>> DFS Remaining%: 0%
>> Last contact: Wed Dec 31 16:00:00 PST 1969
>> ***************
>>
>> In the UI, the excluded nodes show up in both the live and dead node lists. And it's been several hours now. The block counts across the nodes are exactly the same.
>> The cluster is not accessed by any clients; it's not busy at all.
>>
>> And I have set dfs.balance.bandwidthPerSec = 2000000 in hdfs-site.xml
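[Editor's note: for reference, the property quoted above lives in hdfs-site.xml and its value is in bytes per second per datanode, so 2000000 is roughly 2 MB/s. A sketch of the fragment, matching the value stated in the mail:]

```xml
<!-- hdfs-site.xml: per-datanode bandwidth cap for block rebalancing,
     in bytes per second (2000000 ~= 2 MB/s, as set above) -->
<property>
  <name>dfs.balance.bandwidthPerSec</name>
  <value>2000000</value>
</property>
```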
>>
>> Anyway, I think I am lost here. I am just resorting to a kill-2-nodes-at-a-time, sort of backwards strategy. At least I know it works.
>>
>> Thanks
>> Arun
>>
>> -----Original Message-----
>> From: Varene Olivier [mailto:[EMAIL PROTECTED]]
>> Sent: Friday, July 09, 2010 7:44 AM
>> To: [EMAIL PROTECTED]
>> Subject: Re: decommissioning nodes help
>>
>> Hello,
>>
>> you should see, in the Web interface
>>
>> http://yourDatanodeMaster:50070/
>>
>> the status of your node change to Decommissioning;
>> when done, it is removed from the list of active nodes
>>
>> With a big enough bandwidth to perform the sync, the process is very fast
>> so, to answer your other mail, the process might already be done
>>
>> you can also check the status of your node via the CLI
>>
>> # hadoop dfsadmin -report
>>
>> Name : ...
>> Decommission Status : <StatusOfYourNode>
>> ...
>>
>>
>> Hope it helps
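[Editor's note: the -report check above is easy to script. A minimal sketch, using a canned two-node sample in the stock -report format so it stands alone — node names are illustrative, and on a live cluster the report would come from running the command:]

```shell
# Hedged sketch: count nodes still decommissioning in the output of
#   report=$(hadoop dfsadmin -report)
# Canned sample (stock -report format, illustrative node names):
report='Name: 10.1.2.3:50010
Decommission Status : Decommission in progress
Configured Capacity: 1000 (1000 B)

Name: 10.1.2.4:50010
Decommission Status : Normal
Configured Capacity: 1000 (1000 B)'

in_progress=$(echo "$report" | grep -c "Decommission in progress")
echo "$in_progress node(s) still decommissioning"
```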
>>
>>
>>
>> Arun Ramakrishnan wrote:
>>> Hi guys
>>>
>>> I am stuck in my attempt to remove nodes from HDFS.
>>>
>>> I followed the steps in https://issues.apache.org/jira/browse/HDFS-1125
>>>
>>> a) add node to dfs.hosts.exclude
>>>
>>> b) dfsadmin -refreshNodes
>>>
>>> c) wait for decom to finish
>>>
>>> d) remove node from both dfs.hosts and dfs.hosts.exclude
>>>
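[Editor's note: steps a)-d) above can be sketched as a shell sequence. The exclude-file path and hostname below are assumptions for illustration; the path must match whatever dfs.hosts.exclude points at in hdfs-site.xml on the namenode. A temp file is used here so the sketch runs standalone, and the namenode command is shown in a comment rather than executed:]

```shell
# Hedged sketch of the decommissioning steps; file path and hostname
# are illustrative. On a real cluster EXCLUDE is the existing file
# named by dfs.hosts.exclude, not a temp file.
EXCLUDE=$(mktemp)

# a) add the node(s) to retire, one hostname per line
echo "datanode07.example.com" >> "$EXCLUDE"

# b) make the namenode re-read its include/exclude lists:
#      hadoop dfsadmin -refreshNodes

# c) wait until "hadoop dfsadmin -report" shows
#    "Decommission Status : Decommissioned" for each listed node

# d) remove the node from dfs.hosts and dfs.hosts.exclude,
#    run -refreshNodes again, then stop the datanode process
cat "$EXCLUDE"
```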
>>>
>>>
>>> But after steps a) and b), how do I know if decommissioning is complete?
>>>
>>> I am in the process of decommissioning 6 nodes and don't want to lose
>>> any blocks (rep factor is 3) with a restart.
>>>
>>>
>>>
>>> I also opened https://issues.apache.org/jira/browse/HDFS-1290 if anyone
>>> is interested.
>>>
>>>
>>>
>>> Thanks
>>>
>>> Arun
>>>
>>>
>>>
Varene Olivier 2010-07-15, 14:30