-
Is it possible to recover name node just using the meta data in a new machine
Jeff Zhang 2010-05-21, 04:35
Hi all,
I'd like to recover the name node on a new machine. I copied the metadata from the old name node to the new one and modified the configuration (including fs.default.name). Then I stopped the old DFS cluster and started the new one on the new machine, and I got the following error message. Does anyone have any ideas?
2010-05-21 01:02:37,552 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50070
2010-05-21 01:02:37,554 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: ReplicationMonitor thread received InterruptedException.java.lang.InterruptedException: sleep interrupted
2010-05-21 01:02:37,555 INFO org.apache.hadoop.hdfs.server.namenode.DecommissionManager: Interrupted Monitor
java.lang.InterruptedException: sleep interrupted
    at java.lang.Thread.sleep(Native Method)
    at org.apache.hadoop.hdfs.server.namenode.DecommissionManager$Monitor.run(DecommissionManager.java:65)
    at java.lang.Thread.run(Thread.java:619)
2010-05-21 01:02:37,556 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of transactions: 0 Total time for transactions(ms): 0Number of transactions batched in Syncs: 0 Number of syncs: 0 SyncTimes(ms): 0 0
2010-05-21 01:02:37,565 INFO org.apache.hadoop.ipc.Server: Stopping server on 9000
2010-05-21 01:02:37,566 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.net.BindException: Cannot assign requested address
    at sun.nio.ch.Net.bind(Native Method)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
    at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
    at org.apache.hadoop.http.HttpServer.start(HttpServer.java:424)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:246)
-- Best Regards
Jeff Zhang
-
Re: Is it possible to recover name node just using the meta data in a new machine
Todd Lipcon 2010-05-21, 04:39
The new machine apparently has something listening on one of the NN ports.
Try sudo fuser -n tcp 50070 (it will tell you which PID is listening on that port).
-Todd
-- Todd Lipcon Software Engineer, Cloudera
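Editorial aside: a port held by another process and an address that is not local to the machine produce different bind errors, so it can help to tell them apart directly. On Linux, "Address already in use" corresponds to EADDRINUSE, while "Cannot assign requested address" (the message in the log above) corresponds to EADDRNOTAVAIL. The following is a minimal Python sketch, not part of Hadoop; the function name diagnose_bind and its return strings are made up for illustration.

```python
import errno
import socket


def diagnose_bind(host, port):
    """Try to bind a TCP socket to (host, port) and classify the outcome.

    Returns one of: "ok", "in use", "not local", "unresolvable".
    Hypothetical helper for illustration; it is not a Hadoop API.
    """
    try:
        infos = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        # The name could not be resolved at all on this machine.
        return "unresolvable"

    family, socktype, proto, _canonname, sockaddr = infos[0]
    with socket.socket(family, socktype, proto) as sock:
        try:
            sock.bind(sockaddr)
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                # Another socket already holds this address and port.
                return "in use"
            if exc.errno == errno.EADDRNOTAVAIL:
                # The resolved IP does not belong to any local interface.
                return "not local"
            raise
    return "ok"
```

Calling diagnose_bind with the host and port taken from the NameNode configuration, on the new machine, would return "not local" if the configured name still resolves to the old machine's IP, in which case fuser would be expected to show nothing on the port.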
-
Re: Is it possible to recover name node just using the meta data in a new machine
Jeff Zhang 2010-05-21, 04:50
Hi Todd,
I tried the command, but no process is using port 50070.
-- Best Regards
Jeff Zhang
-
Re: Is it possible to recover name node just using the meta data in a new machine
Todd Lipcon 2010-05-21, 04:58
Double-check that fs.default.name and dfs.http.address both point to a hostname that resolves, on that machine, to a local IP address. It seems to think you're trying to bind to a non-local IP.
-Todd
-- Todd Lipcon Software Engineer, Cloudera
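Editorial aside: the check described in this message can be sketched in a few lines of Python. The sketch assumes fs.default.name is written as a URI such as hdfs://host:port and dfs.http.address as a bare host:port pair; the helper names host_of and resolves_locally, and the example hostname, are hypothetical. It treats an address as local if a socket can be bound to it on an ephemeral port.

```python
import socket
from urllib.parse import urlsplit


def host_of(value):
    """Extract the host from 'scheme://host:port' or a bare 'host:port'."""
    if "://" not in value:
        # urlsplit only recognises a network location after '//'.
        value = "//" + value
    return urlsplit(value).hostname


def resolves_locally(host):
    """Return True if host resolves to an IPv4 address this machine can bind."""
    try:
        addr = socket.gethostbyname(host)
    except OSError:
        # Resolution failed, so the name cannot point at a local address.
        return False
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            # Port 0 asks the OS for any free port, so only the address matters.
            sock.bind((addr, 0))
        except OSError:
            return False
    return True


# Example with a made-up hostname:
# resolves_locally(host_of("hdfs://namenode.example.com:9000"))
```

Running both configured values through resolves_locally on the new machine should flag whichever one still names the old host.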
-
Re: Is it possible to recover name node just using the meta data in a new machine
Jeff Zhang 2010-05-21, 05:19
Thanks, Todd. I forgot to change dfs.http.address.
-- Best Regards
Jeff Zhang