Hadoop user mailing list - Unexpected Hadoop behavior: map task re-running after reducer has been running


Re: Unexpected Hadoop behavior: map task re-running after reducer has been running
David Parks 2013-03-11, 08:30
I should have included the relevant log output in the first place; here it is:
Task attempt_201303080219_0002_r_005846_0 failed to report status for 7201 seconds. Killing!
Task attempt_201303080219_0002_r_005857_0 failed to report status for 7203 seconds. Killing!
Task attempt_201303080219_0002_r_005858_0 failed to report status for 7200 seconds. Killing!
Map output lost, rescheduling: getMapOutput(attempt_201303080219_0002_m_006433_0,6169) failed :
java.io.IOException: Error Reading IndexFile
        at org.apache.hadoop.mapred.IndexCache.readIndexFileToCache(IndexCache.java:113)
        at org.apache.hadoop.mapred.IndexCache.getIndexInformation(IndexCache.java:66)
        at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:3998)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
        at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:842)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: java.io.IOException: Premature EOF from inputStream
        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:155)
        at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:73)
        at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:54)
        at org.apache.hadoop.mapred.IndexCache.readIndexFileToCache(IndexCache.java:109)
        ... 23 more

Task attempt_201303080219_0002_r_005908_0 failed to report status for 7200 seconds. Killing!
Task attempt_201303080219_0002_r_005962_0 failed to report status for 7201 seconds. Killing!
Task attempt_201303080219_0002_r_005971_0 failed to report status for 7202 seconds. Killing!
Task attempt_201303080219_0002_r_005980_0 failed to report status for 7201 seconds. Killing!
Task attempt_201303080219_0002_r_005983_0 failed to report status for 7202 seconds. Killing!
Task attempt_201303080219_0002_r_005988_0 failed to report status for 7200 seconds. Killing!
attempt_201303080219_0002_r_005988_0: [Fatal Error] :1:1: Premature end of file.
Task attempt_201303080219_0002_r_005986_0 failed to report status for 7202 seconds. Killing!
Task attempt_201303080219_0002_r_005992_0 failed to report status for 7202 seconds. Killing!
attempt_201303080219_0002_r_005992_0: [Fatal Error] :1:1: Premature end of file.
Task attempt_201303080219_0002_r_005993_0 failed to report status for 7200 seconds. Killing!
Task attempt_201303080219_0002_r_006004_0 failed to report status for 7201 seconds. Killing!
Task attempt_201303080219_0002_r_005999_0 failed to report status for 7201 seconds. Killing!
attempt_201303080219_0002_r_005999_0: [Fatal Error] :1:1: Premature end of file.
Too many fetch-failures
Task attempt_201303080219_0002_r_006005_0 failed to report status for 7202 seconds. Killing!
attempt_201303080219_0002_r_006005_0: [Fatal Error] :1:1: Premature end of file.
Task attempt_201303080219_0002_r_006007_0 failed to report status for 7201 seconds. Killing!
attempt_201303080219_0002_r_006007_0: [Fatal Error] :1:1: Premature end of file.
Task attempt_201303080219_0002_r_006010_0 failed to report status for 7201 seconds. Killing!
Task attempt_201303080219_0002_r_006013_0 failed to report status for 7201 seconds. Killing!
Too many fetch-failures
Too many fetch-failures
Task attempt_201303080219_0002_r_006026_0 failed to report status for 7201 seconds. Killing!
attempt_201303080219_0002_r_006026_0: [Fatal Error] :1:1: Premature end of file.
Too many fetch-failures
Too many fetch-failures
Too many fetch-failures
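
Two distinct failure modes show up in this log. The "failed to report status for 7200 seconds" lines mean an attempt ran past the task timeout without reporting progress, so the TaskTracker killed it. The "Map output lost, rescheduling ... Error Reading IndexFile" and "Too many fetch-failures" lines mean reducers could not retrieve a completed map's output, and that is what causes the JobTracker to re-run the map. If the reduce-side kills are simply long-running reduce() calls, reporting progress from inside the loop avoids them; the sketch below is illustrative only (the class, key/value types, and reporting interval are assumptions, not taken from this job).

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Illustrative sketch, not code from this job: a reduce() doing long per-key
// work that would otherwise trip the "failed to report status" timeout.
public class LongRunningReducer
    extends Reducer<Text, LongWritable, Text, LongWritable> {

  @Override
  protected void reduce(Text key, Iterable<LongWritable> values, Context context)
      throws IOException, InterruptedException {
    long sum = 0;
    long seen = 0;
    for (LongWritable value : values) {
      sum += value.get();  // stand-in for the real per-record work
      if (++seen % 10000 == 0) {
        // Tell the TaskTracker this attempt is still alive so it is not
        // killed after mapred.task.timeout milliseconds of silence.
        context.progress();
      }
    }
    context.write(key, new LongWritable(sum));
  }
}

The same applies to mappers; with the old mapred API the equivalent call is Reporter.progress(). The timeout itself is governed by mapred.task.timeout (600000 ms by default), and the 7200-second figure above suggests it was raised on this cluster. The map re-execution, by contrast, comes from the fetch-failure path: once enough reducers report fetch failures for one map's output, the JobTracker declares that output lost and schedules the map again, which matches the single map task reappearing in the job status below.
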
________________________________
 From: David Parks <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Sent: Monday, March 11, 2013 3:23 PM
Subject: Unexpected Hadoop behavior: map task re-running after reducer has been running
 

I can’t explain this behavior; can someone help me here:
 
  Kind      % Complete   Num Tasks   Pending   Running   Complete   Killed   Failed/Killed Task Attempts
  map          100.00%       23547         0         1      23546        0   247 / 0
  reduce        62.40%       10000      3738        30       6232        0   336 / 0
 
This job has been running for 48 hours and the reducer is well into its processing, but we’ve hit a snag (I’m not sure exactly what).
 
The map tasks were 100% complete, none running, but now I see 1 map task running. In a few minutes that map task will finish and I’ll see “Running map tasks” change to 0, the # of failed map tasks will increment by 1, and the map task will run again a short time thereafter. This seems p