HDFS >> mail # user >> Re: Child Error


Re: Child Error
Hi again. In addition to my previous post, I was able to get some error
logs from the task tracker/data node this morning, and it looks like it
might be a Jetty issue:

2013-05-23 19:59:20,595 WARN org.apache.hadoop.mapred.TaskLog: Failed to retrieve stdout log for task: attempt_201305231647_0007_m_001096_0
java.io.IOException: Owner 'jim' for path /var/tmp/jim/hadoop-logs/userlogs/job_201305231647_0007/attempt_201305231647_0007_m_001096_0/stdout did not match expected owner '10929'
  at org.apache.hadoop.io.SecureIOUtils.checkStat(SecureIOUtils.java:177)
  at org.apache.hadoop.io.SecureIOUtils.openForRead(SecureIOUtils.java:117)
  at org.apache.hadoop.mapred.TaskLog$Reader.<init>(TaskLog.java:455)
  at org.apache.hadoop.mapred.TaskLogServlet.printTaskLog(TaskLogServlet.java:81)
  at org.apache.hadoop.mapred.TaskLogServlet.doGet(TaskLogServlet.java:296)
  at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
  at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
  at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
  at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
  at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:848)
  at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
  at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)

I am wondering if I am hitting MAPREDUCE-2389
<https://issues.apache.org/jira/browse/MAPREDUCE-2389>. If so, how do I
downgrade my Jetty version? Should I just replace the Jetty jar file in
the lib directory with an earlier version and restart my cluster?

Thank you.
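
The IOException above compares an owner name ('jim') with what looks like a numeric ID ('10929'). The following is a minimal sketch, assuming a POSIX node with Python available, of how one might check whether a file's numeric owner resolves to an account name on a given node. It is not Hadoop's SecureIOUtils logic, and `resolve_owner` and `owner_matches` are hypothetical helper names introduced only for illustration.

```python
import os
import pwd
import tempfile


def resolve_owner(path):
    """Return (uid, name) for the owner of path.

    name is None when the UID has no passwd entry on this node.
    """
    uid = os.stat(path).st_uid
    try:
        name = pwd.getpwuid(uid).pw_name
    except KeyError:
        name = None
    return uid, name


def owner_matches(path, expected_owner):
    """Compare expected_owner against both the owner's name and its numeric UID."""
    uid, name = resolve_owner(path)
    return expected_owner == name or expected_owner == str(uid)


if __name__ == "__main__":
    # Demonstrate on a temporary file; on a cluster node one would pass
    # the path reported in the log message instead.
    with tempfile.NamedTemporaryFile() as f:
        uid, name = resolve_owner(f.name)
        print("uid=%d name=%s" % (uid, name))
        print("matches numeric uid:", owner_matches(f.name, str(uid)))
```

If the name comes back as None, or differs between nodes for the same UID, that may be worth ruling out before changing jar files.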
On Thu, May 23, 2013 at 7:14 PM, Jim Twensky <[EMAIL PROTECTED]> wrote:

> Hello, I have a 20-node Hadoop cluster where each node has 8 GB of memory
> and an 8-core processor. I intermittently get the following error:
>
>
>
> -----------------------------------------------------------------------------------------------------------
>
> Exception in thread "main" java.io.IOException: Exception reading file:/var/tmp/jim/hadoop-jim/mapred/local/taskTracker/jim/jobcache/job_201305231647_0005/jobToken
> at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:135)
> at org.apache.hadoop.mapreduce.security.TokenCache.loadTokens(TokenCache.java:165)
> at org.apache.hadoop.mapred.Child.main(Child.java:92)
> Caused by: java.io.IOException: failure to login
> at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:501)
> at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:463)
> at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:1519)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1420)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
> at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
> at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:129)
> ... 2 more
> Caused by: javax.security.auth.login.LoginException: java.lang.NullPointerException: invalid null input: name
> at com.sun.security.auth.UnixPrincipal.<init>(UnixPrincipal.java:70)
> at com.sun.security.auth.module.UnixLoginModule.login(UnixLoginModule.java:132)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>
> ......
>
>
> -----------------------------------------------------------------------------------------------------------
>
> This does not always happen, but I see a pattern: when the intermediate
> data is larger, it tends to occur more frequently. In the web log, I can
> see the following:
>
> java.lang.Throwable: Child Error
> at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
> Caused by: java.io.IOException: Task process exit with nonzero status of 1.
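
The "invalid null input: name" message in the quoted trace comes from the UnixPrincipal constructor receiving a null user name. One possible cause, stated here as an assumption rather than a diagnosis, is that the operating system could not map the process's UID to an account name at that moment. A minimal sketch of that lookup, assuming a POSIX node with Python available (`username_for_uid` and `check_current_user` are hypothetical names):

```python
import os
import pwd


def username_for_uid(uid):
    """Return the account name for uid, or None when no passwd entry is found."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return None


def check_current_user():
    """Report whether the user running this process resolves to a name."""
    uid = os.getuid()
    name = username_for_uid(uid)
    status = "ok" if name is not None else "unresolved"
    return uid, name, status


if __name__ == "__main__":
    print("uid=%d name=%s status=%s" % check_current_user())
```

Running this repeatedly on a task tracker node, as the same user that launches the child tasks, could show whether the lookup fails intermittently.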