HDFS, mail # user - Re: Child Error


Re: Child Error
Jean-Marc Spaggiari 2013-05-25, 17:14
Hi Jim,

Would you be able to run the same test with Oracle JDK 1.6 instead of OpenJDK
1.7 to see if it makes a difference?

JM
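
When comparing JDKs across nodes, it can help to record what each node actually reports. Below is a minimal sketch that parses `java -version` text, using the output quoted further down in this thread as sample input. The function name `parse_java_version` and the "OpenJDK"/"other" labels are illustrative assumptions, not anything provided by Hadoop.

```python
import re

def parse_java_version(output):
    """Extract the version string and runtime family from `java -version` text."""
    match = re.search(r'version "([^"]+)"', output)
    # Illustrative labelling only: anything that does not mention OpenJDK is "other".
    runtime = "OpenJDK" if "OpenJDK" in output else "other"
    return {
        "version": match.group(1) if match else None,
        "runtime": runtime,
    }

# Sample taken from the `java -version` output quoted in this thread.
sample = '''java version "1.7.0_21"
OpenJDK Runtime Environment (IcedTea 2.3.9) (7u21-2.3.9-0ubuntu0.12.10.1)
OpenJDK 64-Bit Server VM (build 23.7-b01, mixed mode)'''

info = parse_java_version(sample)
print(info)
```

Collecting this from every node (note that `java -version` writes to stderr) would show whether the nodes that get blacklisted run a different JVM from the rest.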

2013/5/25 Jim Twensky <[EMAIL PROTECTED]>

> Hi Jean, thanks for replying. I'm using java 1.7.0_21 on ubuntu. Here is
> the output:
>
> $ java -version
> java version "1.7.0_21"
> OpenJDK Runtime Environment (IcedTea 2.3.9) (7u21-2.3.9-0ubuntu0.12.10.1)
> OpenJDK 64-Bit Server VM (build 23.7-b01, mixed mode)
>
> I don't get any OOME errors and this error happens on random nodes, not a
> particular one. Usually all tasks running on a particular node fail and
> that node gets blacklisted. However, the same node works just fine during
> the next or previous jobs. Can it be a problem with the ssh keys? What else
> can cause the IOException with "failure to login" message? I've been
> digging into this for two days but I'm almost clueless.
>
> Thanks,
> Jim
>
>
>
>
> On Fri, May 24, 2013 at 10:32 PM, Jean-Marc Spaggiari <[EMAIL PROTECTED]> wrote:
>
>> Hi Jim,
>>
>> Which JVM are you using?
>>
>> I don't think you have a memory issue. Otherwise you would have gotten some
>> OOME...
>>
>> JM
>>
>>
>> 2013/5/24 Jim Twensky <[EMAIL PROTECTED]>
>>
>>> Hi again, in addition to my previous post, I was able to get some error
>>> logs from the task tracker/data node this morning, and it looks like it
>>> might be a jetty issue:
>>>
>>> 2013-05-23 19:59:20,595 WARN org.apache.hadoop.mapred.TaskLog: Failed to retrieve stdout log for task: attempt_201305231647_0007_m_001096_0
>>> java.io.IOException: Owner 'jim' for path /var/tmp/jim/hadoop-logs/userlogs/job_201305231647_0007/attempt_201305231647_0007_m_001096_0/stdout did not match expected owner '10929'
>>>   at org.apache.hadoop.io.SecureIOUtils.checkStat(SecureIOUtils.java:177)
>>>   at org.apache.hadoop.io.SecureIOUtils.openForRead(SecureIOUtils.java:117)
>>>   at org.apache.hadoop.mapred.TaskLog$Reader.<init>(TaskLog.java:455)
>>>   at org.apache.hadoop.mapred.TaskLogServlet.printTaskLog(TaskLogServlet.java:81)
>>>   at org.apache.hadoop.mapred.TaskLogServlet.doGet(TaskLogServlet.java:296)
>>>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
>>>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>>>   at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
>>>   at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
>>>   at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:848)
>>>   at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>>>   at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
>>>
>>>
>>> I am wondering if I am hitting MAPREDUCE-2389
>>> <https://issues.apache.org/jira/browse/MAPREDUCE-2389>. If so, how do I
>>> downgrade my jetty version? Should I just replace the jetty jar file in
>>> the lib directory with an earlier version and restart my cluster?
>>>
>>> Thank you.
>>>
>>>
>>>
>>>
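
The warning above reports an owner of 'jim' where '10929' was expected. As a rough illustration of how a message of that shape can arise, here is a minimal sketch of a plain string comparison between a reported owner and an expected owner. This is not Hadoop's `SecureIOUtils` code; the function `check_owner` and the path `/tmp/example/stdout` are hypothetical, and whether '10929' is the numeric id of the 'jim' account is an assumption the thread does not confirm.

```python
def check_owner(path, actual_owner, expected_owner):
    """Raise IOError if the reported owner differs from the expected one.

    The comparison is on strings only: a user name and a numeric id never
    compare equal, even if they were to refer to the same account.
    """
    if expected_owner is not None and actual_owner != expected_owner:
        raise IOError(
            "Owner '%s' for path %s did not match expected owner '%s'"
            % (actual_owner, path, expected_owner)
        )

# Hypothetical inputs mirroring the values in the log line above.
try:
    check_owner("/tmp/example/stdout", "jim", "10929")
    message = None
except IOError as exc:
    message = str(exc)

print(message)
```

If the two values did name the same account, a check of this kind would still fail unless one side were first resolved to the other's form, which is one possible reading of the warning.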
>>> On Thu, May 23, 2013 at 7:14 PM, Jim Twensky <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hello, I have a 20 node Hadoop cluster where each node has 8GB memory
>>>> and an 8-core processor. I sometimes get the following error on a random
>>>> basis:
>>>>
>>>>
>>>>
>>>> -----------------------------------------------------------------------------------------------------------
>>>>
>>>> Exception in thread "main" java.io.IOException: Exception reading file:/var/tmp/jim/hadoop-jim/mapred/local/taskTracker/jim/jobcache/job_201305231647_0005/jobToken
>>>> at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:135)
>>>> at org.apache.hadoop.mapreduce.security.TokenCache.loadTokens(TokenCache.java:165)
>>>> at org.apache.hadoop.mapred.Child.main(Child.java:92)
>>>> Caused by: java.io.IOException: failure to login
>>>> at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:501)
>>>> at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:463)