Hadoop >> mail # user >> what will happen if a backup name node folder becomes unaccessible?


Re: what will happen if a backup name node folder becomes unaccessible?
On Fri, Aug 27, 2010 at 8:30 PM, jiang licht <[EMAIL PROTECTED]> wrote:
> The same behavior is seen in CDH3 hadoop-0.20.2+228 if a mounted nfs folder for dfs.name.dir is not available when a name node starts...
>
> Michael
>
> --- On Fri, 8/27/10, Edward Capriolo <[EMAIL PROTECTED]> wrote:
>
> From: Edward Capriolo <[EMAIL PROTECTED]>
> Subject: Re: what will happen if a backup name node folder becomes unaccessible?
> To: [EMAIL PROTECTED]
> Date: Friday, August 27, 2010, 6:57 PM
>
> On Tue, Aug 24, 2010 at 7:59 PM, Sudhir Vallamkondu
> <[EMAIL PROTECTED]> wrote:
>> The Cloudera distribution seems to keep working fine when a dfs.name.dir
>> directory becomes inaccessible while the namenode is running.
>>
>> See below
>>
>> hadoop@training-vm:~$ hadoop version
>> Hadoop 0.20.1+152
>> Subversion  -r c15291d10caa19c2355f437936c7678d537adf94
>> Compiled by root on Mon Nov  2 05:15:37 UTC 2009
>>
>> hadoop@training-vm:~$ jps
>> 8923 Jps
>> 8548 JobTracker
>> 8467 SecondaryNameNode
>> 8250 NameNode
>> 8357 DataNode
>> 8642 TaskTracker
>>
>> hadoop@training-vm:~$ /usr/lib/hadoop/bin/stop-all.sh
>> stopping jobtracker
>> localhost: stopping tasktracker
>> stopping namenode
>> localhost: stopping datanode
>> localhost: stopping secondarynamenode
>>
>> hadoop@training-vm:~$ mkdir edit_log_dir1
>>
>> hadoop@training-vm:~$ mkdir edit_log_dir2
>>
>> hadoop@training-vm:~$ ls
>> edit_log_dir1  edit_log_dir2
>>
>> hadoop@training-vm:~$ ls -ltr /var/lib/hadoop-0.20/cache/hadoop/dfs/name
>> total 8
>> drwxr-xr-x 2 hadoop hadoop 4096 2009-10-15 16:17 image
>> drwxr-xr-x 2 hadoop hadoop 4096 2010-08-24 15:56 current
>>
>> hadoop@training-vm:~$ cp -r /var/lib/hadoop-0.20/cache/hadoop/dfs/name
>> edit_log_dir1
>>
>> hadoop@training-vm:~$ cp -r /var/lib/hadoop-0.20/cache/hadoop/dfs/name
>> edit_log_dir2
>>
>> ------ hdfs-site.xml added new dirs
>>
>> <?xml version="1.0"?>
>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>
>> <configuration>
>>  <property>
>>    <name>dfs.replication</name>
>>    <value>1</value>
>>  </property>
>>  <property>
>>     <name>dfs.permissions</name>
>>     <value>false</value>
>>  </property>
>>  <property>
>>     <!-- specify this so that running 'hadoop namenode -format' formats the
>> right dir -->
>>     <name>dfs.name.dir</name>
>> <value>/var/lib/hadoop-0.20/cache/hadoop/dfs/name,/home/hadoop/edit_log_dir1,/home/hadoop/edit_log_dir2</value>
>>  </property>
>>   <property>
>>     <name>fs.checkpoint.period</name>
>>     <value>600</value>
>>  </property>
>>  <property>
>>    <name>dfs.namenode.plugins</name>
>>    <value>org.apache.hadoop.thriftfs.NamenodePlugin</value>
>>  </property>
>>  <property>
>>    <name>dfs.datanode.plugins</name>
>>    <value>org.apache.hadoop.thriftfs.DatanodePlugin</value>
>>  </property>
>>  <property>
>>    <name>dfs.thrift.address</name>
>>    <value>0.0.0.0:9090</value>
>>  </property>
>> </configuration>
>>
>> ---- start all daemons
>>
>> hadoop@training-vm:~$ /usr/lib/hadoop/bin/start-all.sh
>> starting namenode, logging to
>> /usr/lib/hadoop/bin/../logs/hadoop-hadoop-namenode-training-vm.out
>> localhost: starting datanode, logging to
>> /usr/lib/hadoop/bin/../logs/hadoop-hadoop-datanode-training-vm.out
>> localhost: starting secondarynamenode, logging to
>> /usr/lib/hadoop/bin/../logs/hadoop-hadoop-secondarynamenode-training-vm.out
>> starting jobtracker, logging to
>> /usr/lib/hadoop/bin/../logs/hadoop-hadoop-jobtracker-training-vm.out
>> localhost: starting tasktracker, logging to
>> /usr/lib/hadoop/bin/../logs/hadoop-hadoop-tasktracker-training-vm.out
>>
>>
>> -------- namenode log confirms all dirs taken
>>
>> 2010-08-24 16:20:48,718 INFO
>> org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
>> /************************************************************
>> STARTUP_MSG: Starting NameNode
>> STARTUP_MSG:   host = training-vm/127.0.0.1
>> STARTUP_MSG:   args = []
>> STARTUP_MSG:   version = 0.20.1+152
>> STARTUP_MSG:   build =  -r c15291d10caa19c2355f437936c7678d537adf94;

NFS is not exactly equivalent to a local file system. For example, you can
soft- or hard-mount an NFS file system, and the system reacts differently
when the NFS mount vanishes. On some operating systems a hard mount will
leave processes stuck in an uninterruptible wait. Soft mounts, which I
believe are the Linux default, behave differently when the NFS server
vanishes, and that could explain the errors you are getting.

If you take the approach that NFS works exactly like a local file system,
you will often be disappointed.
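Given the startup failures described earlier in the thread, one pragmatic mitigation is to check every dfs.name.dir path before bringing the namenode up, so a stale NFS mount is caught early. This is only a sketch; the function name and the directory list are my own, not part of any Hadoop distribution:

```shell
#!/bin/sh
# check_name_dirs: succeed only if every argument is an existing,
# writable directory; report the first unavailable one otherwise.
# Intended to run before start-all.sh so a missing or read-only NFS
# mount fails fast instead of surfacing as a NameNode startup error.
check_name_dirs() {
  for d in "$@"; do
    if [ ! -d "$d" ] || [ ! -w "$d" ]; then
      echo "name dir unavailable: $d" >&2
      return 1
    fi
  done
  echo "all name dirs writable"
}

# Example (paths match the dfs.name.dir value shown above):
# check_name_dirs /var/lib/hadoop-0.20/cache/hadoop/dfs/name \
#                 /home/hadoop/edit_log_dir1 /home/hadoop/edit_log_dir2 \
#   && /usr/lib/hadoop/bin/start-all.sh
```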