-
Error using hadoop distcp
praveenesh kumar 2011-10-05, 05:15
I am trying to use distcp to copy a file from one HDFS to another.
But while copying I am getting the following exception :
hadoop distcp hdfs://ub13:54310/user/hadoop/weblog hdfs://ub16:54310/user/hadoop/weblog
11/10/05 10:41:01 INFO mapred.JobClient: Task Id : attempt_201110031447_0005_m_000007_0, Status : FAILED
java.net.UnknownHostException: unknown host: ub16
    at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:195)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:850)
    at org.apache.hadoop.ipc.Client.call(Client.java:720)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
    at $Proxy1.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
    at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:113)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:215)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:177)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:175)
    at org.apache.hadoop.mapred.FileOutputCommitter.setupJob(FileOutputCommitter.java:48)
    at org.apache.hadoop.mapred.OutputCommitter.setupJob(OutputCommitter.java:124)
    at org.apache.hadoop.mapred.Task.runJobSetupTask(Task.java:835)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:296)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
It says it cannot find ub16, but the entry is there in the /etc/hosts file. I am able to ssh to both machines. Do I need passwordless ssh between these two NNs? What could be the issue? Is there anything I am missing before using distcp?
Thanks, Praveenesh
-
Re: Error using hadoop distcp
trang van anh 2011-10-05, 07:06
Which host runs the task that throws the exception? Make sure every data node knows the other data nodes in the Hadoop cluster: add a "ub16" entry to /etc/hosts on the node where the task is running.
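One way to act on this suggestion is to test name resolution on each node that runs tasks, not only on the machine where distcp is launched. The following is a minimal Python sketch, not part of Hadoop; the function name `check_hosts` and its `resolve` parameter are made up for illustration. It asks the local resolver for each name and records the ones that fail.

```python
import socket


def check_hosts(hostnames, resolve=socket.gethostbyname):
    """Return {hostname: ip_or_None} using the resolver of the local machine.

    `resolve` is injectable (hypothetical parameter) so the function can be
    exercised without touching the network.
    """
    results = {}
    for name in hostnames:
        try:
            results[name] = resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            results[name] = None
    return results


# Example use on a cluster node (hostnames taken from the thread):
#   for name, ip in check_hosts(["ub13", "ub16"]).items():
#       print(name, ip if ip else "UNRESOLVED")
```

Running this on every tasktracker node should show which machine, if any, cannot resolve ub16.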
-
Re: Error using hadoop distcp
bejoy.hadoop@... 2011-10-05, 08:25
Hi Praveenesh,
Can you try repeating the distcp using IP addresses instead of host names? The error looks like an RPC exception caused by a host that cannot be identified, so I don't believe it is due to missing passwordless ssh. Just try it out.
Regards,
Bejoy K S
-
Re: Error using hadoop distcp
praveenesh kumar 2011-10-05, 09:25
I tried that as well. When I use IP addresses, it says I should use the hostname.
hadoop@ub13:~$ hadoop distcp hdfs://162.192.100.53:54310/user/hadoop/weblog hdfs://162.192.100.16:54310/user/hadoop/weblog
11/10/05 14:53:50 INFO tools.DistCp: srcPaths=[hdfs://162.192.100.53:54310/user/hadoop/weblog]
11/10/05 14:53:50 INFO tools.DistCp: destPath=hdfs://162.192.100.16:54310/user/hadoop/weblog
java.lang.IllegalArgumentException: Wrong FS: hdfs://162.192.100.53:54310/user/hadoop/weblog, expected: hdfs://ub13:54310
    at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:310)
    at org.apache.hadoop.hdfs.DistributedFileSystem.checkPath(DistributedFileSystem.java:99)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:155)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:464)
    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:648)
    at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:621)
    at org.apache.hadoop.tools.DistCp.copy(DistCp.java:638)
    at org.apache.hadoop.tools.DistCp.run(DistCp.java:857)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.tools.DistCp.main(DistCp.java:884)
I have entries for both machines in /etc/hosts.
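The "Wrong FS ... expected: hdfs://ub13:54310" message suggests that the source URI was compared literally against the filesystem URI the client is configured with, so an IP-based URI is rejected even if it points to the same machine. The Python sketch below only illustrates that kind of string-level comparison; it is not Hadoop's code, and `same_filesystem` is a hypothetical name.

```python
from urllib.parse import urlparse


def same_filesystem(path_uri, expected_fs_uri):
    """Compare scheme and authority literally, without resolving hostnames."""
    path = urlparse(path_uri)
    expected = urlparse(expected_fs_uri)
    return (path.scheme.lower() == expected.scheme.lower()
            and path.netloc.lower() == expected.netloc.lower())


# The hostname form matches the expected filesystem, the IP form does not:
#   same_filesystem("hdfs://ub13:54310/user/hadoop/weblog", "hdfs://ub13:54310")
#   same_filesystem("hdfs://162.192.100.53:54310/user/hadoop/weblog", "hdfs://ub13:54310")
```

Under this reading, switching to IP addresses trades one error for another, and the hostname form remains the one to get working.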
-
Re: Error using hadoop distcp
Uma Maheswara Rao G 72686... 2011-10-11, 05:35
Distcp runs as a MapReduce job, so the tasktrackers need the hostname mappings in order to contact the other nodes. Please configure the mappings correctly on both machines and try again.
Regards,
Uma
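To compare the mappings between machines, the hosts file on each node can be checked for the names the job needs. This is a rough Python sketch under the assumption that the mappings live in a standard hosts-format file; `parse_hosts` and `missing_hosts` are hypothetical helper names, not Hadoop tools.

```python
def parse_hosts(text):
    """Map each hostname or alias in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)  # keep the first entry seen for a name
    return mapping


def missing_hosts(text, required):
    """Return the required hostnames that have no entry in the hosts text."""
    mapping = parse_hosts(text)
    return [name for name in required if name not in mapping]


# Example use on a node (hostnames taken from the thread):
#   with open("/etc/hosts") as f:
#       print(missing_hosts(f.read(), ["ub13", "ub16"]))
```

An empty list on every tasktracker and on both NameNodes would indicate that the file-based mappings are at least present everywhere.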