-
Job does not run with EOFException
Caetano Sauer 2012-08-28, 13:45
Hello,
I am getting the following error when trying to execute a hadoop job on a 5-node cluster:
Caused by: java.io.IOException: Call to *** failed on local exception: java.io.EOFException
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:1103)
    at org.apache.hadoop.ipc.Client.call(Client.java:1071)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
    at org.apache.hadoop.mapred.$Proxy2.submitJob(Unknown Source)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:921)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
    ... 9 more
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:375)
    at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:800)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:745)
(My jobtracker host was substituted by ***)
After 3 hours of searching, everything points to an incompatibility between the hadoop versions of the client and the server, but this is not the case, since I can run the job on a pseudo-distributed setup on a different machine. Both are running the exact same version (same svn revision and source checksum).
Does anyone have a solution or a suggestion on how to find more debug information?
Thank you in advance, Caetano Sauer
-
Re: Job does not run with EOFException
Harsh J 2012-08-28, 13:47
Are you sure you're reaching the right port for your JobTracker?
-- Harsh J
-
Re: Job does not run with EOFException
Caetano Sauer 2012-08-28, 13:53
The host at the top of the stack trace is the host and port I defined for mapred.job.tracker in mapred-site.xml.
Other than that, I don't know how to verify what you asked me. Any tips?
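One way to double-check the configured address is to read the property back from the file itself. The following is only a minimal sketch using the JDK's XML parser rather than Hadoop's own Configuration class; the class name JobTrackerAddress, the helper readProperty, and the sample value jt.example.com:9001 are hypothetical and stand in for the real mapred-site.xml contents.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class JobTrackerAddress {
    /** Returns the value of the named property in a Hadoop-style *-site.xml, or null if absent. */
    static String readProperty(String xml, String name) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            String n = p.getElementsByTagName("name").item(0).getTextContent().trim();
            if (n.equals(name)) {
                return p.getElementsByTagName("value").item(0).getTextContent().trim();
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample; in practice, load the real mapred-site.xml from the conf directory.
        String xml = "<configuration><property><name>mapred.job.tracker</name>"
                + "<value>jt.example.com:9001</value></property></configuration>";
        String addr = readProperty(xml, "mapred.job.tracker");
        int colon = addr.lastIndexOf(':');
        System.out.println("host=" + addr.substring(0, colon) + " port=" + addr.substring(colon + 1));
    }
}
```

The host and port printed here are what the "Call to ..." line in the stack trace should show if the client is picking up the same configuration file.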
-
Re: Job does not run with EOFException
Hemanth Yamijala 2012-08-29, 08:08
Are you able to browse the web UI for the jobtracker? If not configured separately, it should be at hostname:50030. It would also help if you could telnet to the jobtracker server port and see whether it is able to connect.
Thanks, hemanth
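If telnet is not available on the machine, the same reachability check can be sketched in plain Java. This is a rough equivalent under stated assumptions: the class name PortCheck and the helper canConnect are made up for illustration, and the defaults (localhost, 50030) are placeholders to be replaced with the jobtracker host and the port from mapred.job.tracker.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class PortCheck {
    /** Returns true if a TCP connection to host:port can be opened within the timeout. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unresolvable host: treat all as "not reachable".
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder defaults; pass the real host and port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 50030;
        System.out.println(host + ":" + port + " reachable=" + canConnect(host, port, 2000));
    }
}
```

Note that a successful connect only shows that something is listening on that port. It does not show that the listener speaks the RPC protocol the client expects.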
-
Re: Job does not run with EOFException
Caetano Sauer 2012-08-29, 08:22
I am able to browse the web UI and telnet/netcat the tasktracker host and port, so the connection is being established. Is there any way I can confirm whether it is really some kind of version conflict? The EOF when doing readInt() seems like a protocol incompatibility.
By the way, the tasktracker is killed every time this happens, and I am left with some kind of JVM dump in a hs_err_*.log file. The tasktracker logs show nothing.
Some facts that may help find the problem are:
1) I am not running with a "hadoop" user as it is usually suggested in tutorials
2) There is an older version of hadoop which I am absolutely sure is not running, and even so, it is configured on different ports.
Thank you for your help and regards,
Caetano Sauer
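On the EOF in readInt() discussed above: DataInputStream.readInt() throws EOFException whenever the stream ends before four bytes have been read, so the exception by itself only says that the connection ended before a full response arrived. The sketch below reproduces that behavior with an in-memory stream standing in for the socket; the class name ReadIntEof and the helper hitsEof are hypothetical, and this is not the Hadoop IPC client code.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

class ReadIntEof {
    /** Returns true if reading one int from the given bytes runs into end-of-stream. */
    static boolean hitsEof(byte[] bytesFromPeer) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytesFromPeer))) {
            in.readInt();
            return false;
        } catch (EOFException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(hitsEof(new byte[0]));             // peer closed without sending anything
        System.out.println(hitsEof(new byte[] {0, 0}));       // peer closed in the middle of the int
        System.out.println(hitsEof(new byte[] {0, 0, 0, 7})); // a full 4-byte int was available
    }
}
```

This prints true, true, false. A server that rejects an incompatible client and a server process that dies mid-call would both look like the first two cases from the client side, so the crash and the hs_err_*.log file may be at least as relevant as a version mismatch.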
-
Re: Job does not run with EOFException
Hemanth Yamijala 2012-08-29, 09:18
Maybe you have already done this, in which case please ignore the suggestion. Can you run "hadoop version" using the hadoop script from the same location that is running the JobTracker and the client program?
Thanks, hemanth
-
Re: Job does not run with EOFException
Caetano Sauer 2012-08-29, 09:23
I am submitting the job from the same machine which runs the jobtracker.
Just a correction: In the previous email I meant jobtracker and not tasktracker.