Re: Hang when add/remove a datanode into/from a 2 datanode cluster
sam liu 2013-06-21, 09:43
Actually, in my hdfs-site.xml I always set 'dfs.replication' to 2, but I
still encounter this issue.
2013/6/21 George Kousiouris <[EMAIL PROTECTED]>
> I think I have faced this before. The problem is that you have a replication
> factor of 3, so it seems to hang because it needs 3 nodes to achieve that
> factor (two replicas of a block are never placed on the same node). If you
> set the replication factor to 2, I think you will not have this issue. In
> general, you must make sure that the replication factor is <= the number of
> available datanodes.
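
The rule of thumb above (replication factor <= available datanodes) can be sketched as a small check. This is a simplified model and not the NameNode's actual logic: it assumes at most one replica of a block per datanode and that decommissioning finishes only once every block can reach its full replica count on the remaining nodes. The function names are hypothetical.

```python
def placeable_replicas(replication_factor: int, live_datanodes: int) -> int:
    """Replicas that can actually be placed: at most one per datanode."""
    return min(replication_factor, live_datanodes)


def decommission_can_finish(replication_factor: int, datanodes_before: int) -> bool:
    """True if blocks can be fully replicated without the leaving node.

    Assumption: exactly one datanode is being decommissioned.
    """
    remaining = datanodes_before - 1
    return placeable_replicas(replication_factor, remaining) == replication_factor


# Reported setup: 3 datanodes (dn1, dn2, dn3), then dn3 is decommissioned.
print(decommission_can_finish(2, 3))  # True: 2 replicas fit on dn1 and dn2
print(decommission_can_finish(3, 3))  # False: only 2 nodes remain for 3 replicas
```

Under this model, a cluster-wide factor of 2 should not block the removal of dn3, so a hang would suggest that some blocks carry a higher factor than the configured default.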
> On 6/21/2013 12:29 PM, sam liu wrote:
> I encountered an issue that hangs the decommission operation. Steps to
> reproduce:
> 1. Install a Hadoop 1.1.1 cluster with 2 datanodes, dn1 and dn2, and
> set 'dfs.replication' to 2 in hdfs-site.xml.
> 2. Add node dn3 to the cluster as a new datanode, leaving the
> 'dfs.replication' value in hdfs-site.xml unchanged at 2.
> Note: step 2 succeeded.
> 3. Decommission dn3 from the cluster.
> Expected result: dn3 is decommissioned successfully.
> Actual result: the decommission hangs and the status always remains
> 'Waiting DataNode status: Decommissioned'.
> However, if the initial cluster includes >= 3 datanodes, this issue
> does not occur when adding or removing another datanode.
> Also, after step 2, I noticed that the expected replica count of some
> blocks is 3, even though the 'dfs.replication' value in hdfs-site.xml is
> always 2!
> Could anyone please help triage this?
> Thanks in advance!
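
On the observation that some blocks expect 3 replicas while 'dfs.replication' is 2: HDFS records the replication factor per file when the file is written, and 'dfs.replication' only supplies the default used by the writing client. A file written with a different setting can therefore keep a factor of 3. Whether that is what happened in this cluster is an assumption, not a diagnosis. The sketch below uses a hypothetical file listing (the paths and the FileEntry type are invented for illustration) to show how such files could be singled out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileEntry:
    path: str         # hypothetical HDFS path
    replication: int  # per-file factor recorded when the file was written


def files_blocking_decommission(files: List[FileEntry],
                                remaining_datanodes: int) -> List[str]:
    """Paths whose replication factor cannot be met by the remaining nodes."""
    return [f.path for f in files if f.replication > remaining_datanodes]


# Hypothetical listing: one file follows the default of 2, one was written with 3.
files = [FileEntry("/data/a", 2), FileEntry("/data/b", 3)]
print(files_blocking_decommission(files, remaining_datanodes=2))  # ['/data/b']
```

On a real cluster, the per-file factors would come from something like 'hadoop fsck / -files -blocks', and 'hadoop fs -setrep' can lower the factor of the files found this way.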
> George Kousiouris, PhD
> Electrical and Computer Engineer
> Division of Communications,
> Electronics and Information Engineering
> School of Electrical and Computer Engineering
> Tel: +30 210 772 2546
> Mobile: +30 6939354121
> Fax: +30 210 772 2569
> Email: [EMAIL PROTECTED]
> Site: http://users.ntua.gr/gkousiou/
> National Technical University of Athens
> 9 Heroon Polytechniou str., 157 73 Zografou, Athens, Greece