HDFS >> mail # user >> Re: Child Error


Hi Jim,

Would you be able to run the same test with Oracle JDK 1.6 instead of OpenJDK
1.7 to see if it makes a difference?

JM

2013/5/25 Jim Twensky <[EMAIL PROTECTED]>

> Hi Jean, thanks for replying. I'm using Java 1.7.0_21 on Ubuntu. Here is
> the output:
>
> $ java -version
> java version "1.7.0_21"
> OpenJDK Runtime Environment (IcedTea 2.3.9) (7u21-2.3.9-0ubuntu0.12.10.1)
> OpenJDK 64-Bit Server VM (build 23.7-b01, mixed mode)
>
> I don't get any OOME errors and this error happens on random nodes, not a
> particular one. Usually all tasks running on a particular node fail and
> that node gets blacklisted. However, the same node works just fine during
> the next or previous jobs. Can it be a problem with the ssh keys? What else
> can cause the IOException with "failure to login" message? I've been
> digging into this for two days but I'm almost clueless.
>
> Thanks,
> Jim
>
>
>
>
> On Fri, May 24, 2013 at 10:32 PM, Jean-Marc Spaggiari <
> [EMAIL PROTECTED]> wrote:
>
>> Hi Jim,
>>
>> Which JVM are you using?
>>
>> I don't think you have a memory issue. Otherwise you would have gotten
>> some OOMEs...
>>
>> JM
>>
>>
>> 2013/5/24 Jim Twensky <[EMAIL PROTECTED]>
>>
>>> Hi again, in addition to my previous post, I was able to get some error
>>> logs from the task tracker/data node this morning, and it looks like it
>>> might be a jetty issue:
>>>
>>> 2013-05-23 19:59:20,595 WARN org.apache.hadoop.mapred.TaskLog: Failed to retrieve stdout log for task: attempt_201305231647_0007_m_001096_0
>>> java.io.IOException: Owner 'jim' for path /var/tmp/jim/hadoop-logs/userlogs/job_201305231647_0007/attempt_201305231647_0007_m_001096_0/stdout did not match expected owner '10929'
>>>   at org.apache.hadoop.io.SecureIOUtils.checkStat(SecureIOUtils.java:177)
>>>   at org.apache.hadoop.io.SecureIOUtils.openForRead(SecureIOUtils.java:117)
>>>   at org.apache.hadoop.mapred.TaskLog$Reader.<init>(TaskLog.java:455)
>>>   at org.apache.hadoop.mapred.TaskLogServlet.printTaskLog(TaskLogServlet.java:81)
>>>   at org.apache.hadoop.mapred.TaskLogServlet.doGet(TaskLogServlet.java:296)
>>>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
>>>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>>>   at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
>>>   at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
>>>   at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:848)
>>>   at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>>>   at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
>>>
>>>
>>> I am wondering if I am hitting MAPREDUCE-2389
>>> <https://issues.apache.org/jira/browse/MAPREDUCE-2389>. If so, how do I
>>> downgrade my jetty version? Should I just replace the jetty
>>> jar file in the lib directory with an earlier version and restart my
>>> cluster?
>>>
>>> Thank you.
>>>
>>>
>>>
>>>
>>> On Thu, May 23, 2013 at 7:14 PM, Jim Twensky <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hello, I have a 20 node Hadoop cluster where each node has 8GB memory
>>>> and an 8-core processor. I sometimes get the following error on a random
>>>> basis:
>>>>
>>>>
>>>>
>>>> -----------------------------------------------------------------------------------------------------------
>>>>
>>>> Exception in thread "main" java.io.IOException: Exception reading file:/var/tmp/jim/hadoop-jim/mapred/local/taskTracker/jim/jobcache/job_201305231647_0005/jobToken
>>>> at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:135)
>>>> at org.apache.hadoop.mapreduce.security.TokenCache.loadTokens(TokenCache.java:165)
>>>> at org.apache.hadoop.mapred.Child.main(Child.java:92)
>>>> Caused by: java.io.IOException: failure to login
>>>> at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:501)
>>>> at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:463)