-
Pig accessing remote hadoop cluster
Iman E 2012-03-28, 17:19
Hello all,

I used to run Pig on the same node where the Hadoop jobtracker is running, and everything was fine. Now I am trying to run Pig on my laptop to access the cluster where Hadoop is running, but this fails. I am running Pig 0.8. I have copied the Hadoop configuration directory to my local machine and pointed Pig at its configuration files. Yet Pig fails to establish a connection with the remote Hadoop jobtracker. Any suggestions on what I should do to fix this error and get it to connect to the remote jobtracker?

The error returned is as follows:
2012-03-28 10:00:25,614 [main] INFO org.apache.pig.Main - Logging error messages to: /home/xxx/pig_1332943225604.log
2012-03-28 10:00:25,809 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://XXX.XXX.XXX.XXX:54311
2012-03-28 10:00:46,922 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: /XXX.XXX.XXX.XXX :54311. Already tried 0 time(s).
2012-03-28 10:01:07,935 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: /XXX.XXX.XXX.XXX :54311. Already tried 1 time(s).
The error appearing in the log files is as follows:
Error before Pig is launched
----------------------------
ERROR 2999: Unexpected internal error. Failed to create DataStorage

java.lang.RuntimeException: Failed to create DataStorage
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:75)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.<init>(HDataStorage.java:58)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:213)
    at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:133)
    at org.apache.pig.impl.PigContext.connect(PigContext.java:183)
    at org.apache.pig.PigServer.<init>(PigServer.java:233)
    at org.apache.pig.PigServer.<init>(PigServer.java:222)
    at org.apache.pig.tools.grunt.Grunt.<init>(Grunt.java:55)
    at org.apache.pig.Main.run(Main.java:462)
    at org.apache.pig.Main.main(Main.java:107)
Caused by: java.net.SocketTimeoutException: Call to /XXX.XXX.XXX.XXX :54311 failed on socket timeout exception: java.net.SocketTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/XXX.XXX.XXX.XXX :54311]
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:771)
    at org.apache.hadoop.ipc.Client.call(Client.java:743)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
    at $Proxy0.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
    at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:106)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:207)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:170)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:95)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:72)
    ... 9 more
Caused by: java.net.SocketTimeoutException: 20000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/XXX.XXX.XXX.XXX :54311]
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:213)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:304)
    at org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:860)
    at org.apache.hadoop.ipc.Client.call(Client.java:720)
    ... 22 more

Thanks
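One way to see which host and port a copied configuration directory directs the client to is to read the endpoint properties out of the site files. The following is a minimal sketch, assuming the Hadoop 0.20/1.x-era property names `fs.default.name` (normally in core-site.xml) and `mapred.job.tracker` (normally in mapred-site.xml). The sample XML, the hostname in it, and the helper names `read_hadoop_properties` and `endpoint_of` are hypothetical and not taken from this thread.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical sample that combines both properties in one document for brevity.
# In practice you would read core-site.xml and mapred-site.xml from the copied
# configuration directory instead.
SAMPLE_SITE_XML = """<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode.example.com:54310</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>namenode.example.com:54311</value>
  </property>
</configuration>
"""


def read_hadoop_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def endpoint_of(value):
    """Split 'hdfs://host:port' or a bare 'host:port' into (host, port)."""
    # A bare 'host:port' has no scheme, so prefix '//' to make urlparse
    # treat it as a network location rather than as 'scheme:path'.
    if "://" not in value:
        value = "//" + value
    parsed = urlparse(value)
    return parsed.hostname, parsed.port


props = read_hadoop_properties(SAMPLE_SITE_XML)
for key in ("fs.default.name", "mapred.job.tracker"):
    host, port = endpoint_of(props[key])
    print(f"{key}: host={host} port={port}")
```

Comparing these endpoints with the address in the "Connecting to hadoop file system at" log line shows whether the client is using the values you expect.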
-
Re: Pig accessing remote hadoop cluster
Prashant Kommireddi 2012-03-28, 17:24
Are you able to ping Namenode from your localhost?
-Prashant
-
Re: Pig accessing remote hadoop cluster
Iman E 2012-03-28, 17:48
Yes, I am able to ping the node with the jobtracker and namenode running on it.
-
Re: Pig accessing remote hadoop cluster
Norbert Burger 2012-03-28, 19:07
Are you able to connect on tcp/54311 to that machine? The higher-numbered ports used by Hadoop/HBase are often blocked by firewalls.
Try using netcat -- "nc NAMENODEHOST 54311". Success would be indicated by a hanging connection that's waiting for input.
Norbert
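A successful ping only shows that the host answers ICMP; it says nothing about whether a particular TCP port is reachable. Where netcat is not installed, the same check can be sketched in Python. This is a minimal illustration rather than anything posted in the thread: the function name `can_connect` and the default timeout are assumptions, and the host and port are whatever your own configuration uses.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) can be opened within timeout seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `can_connect("NAMENODEHOST", 54311)` returning False while ping succeeds is consistent with a firewall dropping or rejecting traffic to that port, although it would also be False if nothing were listening there.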
-
Re: Pig accessing remote hadoop cluster
Iman E 2012-03-28, 19:24
Thanks Norbert. Yes, it fails to connect through this port number. I will try to find a solution for the firewall blocking these ports.