Hadoop, mail # user - Hang when add/remove a datanode into/from a 2 datanode cluster


Re: Hang when add/remove a datanode into/from a 2 datanode cluster
Harsh J 2013-06-21, 13:59
dfs.replication is a per-file parameter, applied by the client when a
file is created. If you have a client that does not use the supplied
configs, it falls back to the default replication of 3, and every file
it creates (as part of the app or via a job config) will have a
replication factor of 3.

You can run "hadoop fs -lsr /" to list all files and filter for the ones
that were created with a factor of 3 (versus the expected config of 2).
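
To make the filtering step concrete, here is a minimal Python sketch. It
assumes the usual Hadoop 1.x "-lsr" line layout (permissions, replication,
owner, group, size, date, time, path), with "-" in the replication column
for directories. The function name and the sample lines are hypothetical,
made up only for illustration.

```python
def files_with_replication(lsr_lines, factor=3):
    """Return paths of files whose replication column equals `factor`.

    Assumes each line looks like:
    -rw-r--r--   3 owner group   size yyyy-mm-dd hh:mm /path/to/file
    """
    matches = []
    for line in lsr_lines:
        # Split into at most 8 fields so a path containing spaces stays whole.
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue  # skip blank or unexpected lines
        perms, replication, path = fields[0], fields[1], fields[7]
        if perms.startswith("d") or replication == "-":
            continue  # directories carry no replication factor
        if replication.isdigit() and int(replication) == factor:
            matches.append(path.rstrip("\n"))
    return matches


# Hypothetical sample output, not taken from the cluster in this thread.
sample = [
    "drwxr-xr-x   - sam supergroup          0 2013-06-21 12:00 /user/sam",
    "-rw-r--r--   2 sam supergroup       1024 2013-06-21 12:01 /user/sam/a.txt",
    "-rw-r--r--   3 sam supergroup       2048 2013-06-21 12:02 /user/sam/b.txt",
]
print(files_with_replication(sample))
```

In practice the lines would come from the real command output, for example
by piping "hadoop fs -lsr /" into a script that passes sys.stdin to this
function. If such files turn up, lowering their factor with
"hadoop fs -setrep" may be a possible follow-up, though that goes beyond
what this thread confirms.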

On Fri, Jun 21, 2013 at 3:13 PM, sam liu <[EMAIL PROTECTED]> wrote:
> Hi George,
>
> Actually, in my hdfs-site.xml, I always set 'dfs.replication' to 2, but I
> still encounter this issue.
>
> Thanks!
>
>
> 2013/6/21 George Kousiouris <[EMAIL PROTECTED]>
>>
>>
>> Hi,
>>
>> I think I have faced this before. The problem is that you have the rep
>> factor=3, so it seems to hang because it needs 3 nodes to achieve the factor
>> (replicas are not created on the same node). If you set the replication
>> factor=2, I think you will not have this issue. So in general you must make
>> sure that the rep factor is <= the number of available datanodes.
>>
>> BR,
>> George
>>
>>
>> On 6/21/2013 12:29 PM, sam liu wrote:
>>
>> Hi,
>>
>> I encountered an issue which hangs the decommission operation. Steps:
>> 1. Install a Hadoop 1.1.1 cluster, with 2 datanodes: dn1 and dn2. And, in
>> hdfs-site.xml, set the 'dfs.replication' to 2
>> 2. Add node dn3 into the cluster as a new datanode, without changing the
>> 'dfs.replication' value in hdfs-site.xml (it stays at 2)
>> note: step 2 passed
>> 3. Decommission dn3 from the cluster
>>
>> Expected result: dn3 could be decommissioned successfully
>>
>> Actual result: decommission progress hangs and the status always remains
>> 'Waiting DataNode status: Decommissioned'
>>
>> However, if the initial cluster includes >= 3 datanodes, this issue is not
>> encountered when adding/removing another datanode.
>>
>> Also, after step 2, I noticed that the expected replica count of some blocks
>> is 3, but the 'dfs.replication' value in hdfs-site.xml is always 2!
>>
>> Could anyone please help triage this?
>>
>> Thanks in advance!
>>
>>
>>
>> --
>> ---------------------------
>>
>> George Kousiouris, PhD
>> Electrical and Computer Engineer
>> Division of Communications,
>> Electronics and Information Engineering
>> School of Electrical and Computer Engineering
>> Tel: +30 210 772 2546
>> Mobile: +30 6939354121
>> Fax: +30 210 772 2569
>> Email: [EMAIL PROTECTED]
>> Site: http://users.ntua.gr/gkousiou/
>>
>> National Technical University of Athens
>> 9 Heroon Polytechniou str., 157 73 Zografou, Athens, Greece
>
>

--
Harsh J