Re: MAX_FETCH_RETRIES_PER_MAP (TaskTracker dying?)
Hi Chris,

From the stack trace, it looks like a JVM corruption issue. It is a known
issue that has been fixed in CDH3u2, so I believe an upgrade would solve
your problem.
https://issues.apache.org/jira/browse/MAPREDUCE-3184

Regarding your questions, I'll try to help a bit. In MapReduce, the data
transfer between map and reduce happens over HTTP. Jetty is the HTTP server
embedded in the TaskTracker, so if Jetty is down that transfer can't happen,
which means map output on one node won't be accessible to a reducer on
another node. The map outputs are stored on the local file system (LFS), not
on HDFS, so even if the DataNode on that machine is up, we can't get the
data in these circumstances.
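
If you want to confirm this from another node, a quick TCP probe of the
TaskTracker's HTTP port is usually enough. Below is a minimal sketch, not
anything that ships with Hadoop: the function name is made up, and it assumes
the stock 0.20.x port 50060 (mapred.task.tracker.http.address), so adjust it
if your config overrides that.

```python
import socket

# Assumption: 50060 is the default TaskTracker HTTP port in 0.20.x
# (mapred.task.tracker.http.address). Change it if your cluster overrides it.
DEFAULT_TT_HTTP_PORT = 50060


def tasktracker_http_reachable(host, port=DEFAULT_TT_HTTP_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration only. It checks that something is
    listening on the port, not that map output can actually be served.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused", timeouts and name resolution failures.
        return False
```

Calling tasktracker_http_reachable("had11.atlis1") from a reducer node should
return False while that TaskTracker is down, which matches the "Connection
refused" the reducers are logging.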

Hope it helps!

Regards
Bejoy.K.S
On Tue, Dec 6, 2011 at 2:15 AM, Chris Curtin <[EMAIL PROTECTED]> wrote:

> Hi,
>
> Using: *Version:* 0.20.2-cdh3u0, r81256ad0f2e4ab2bd34b04f53d25a6c23686dd14,
> 8-node cluster, 64-bit CentOS
>
> We are occasionally seeing MAX_FETCH_RETRIES_PER_MAP errors on reducer
> jobs. When we investigate, it looks like the TaskTracker on the node being
> fetched from is not running. Looking at the logs, we see what looks like a
> self-initiated shutdown:
>
> 2011-12-05 14:10:48,632 INFO org.apache.hadoop.mapred.JvmManager: JVM : jvm_201112050908_0222_r_1100711673 exited with exit code 0. Number of tasks it ran: 0
> 2011-12-05 14:10:48,632 ERROR org.apache.hadoop.mapred.JvmManager: Caught Throwable in JVMRunner. Aborting TaskTracker.
> java.lang.NullPointerException
>        at org.apache.hadoop.mapred.DefaultTaskController.logShExecStatus(DefaultTaskController.java:145)
>        at org.apache.hadoop.mapred.DefaultTaskController.launchTask(DefaultTaskController.java:129)
>        at org.apache.hadoop.mapred.JvmManager$JvmManagerForType$JvmRunner.runChild(JvmManager.java:472)
>        at org.apache.hadoop.mapred.JvmManager$JvmManagerForType$JvmRunner.run(JvmManager.java:446)
> 2011-12-05 14:10:48,634 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down TaskTracker at had11.atlis1/10.120.41.118
> ************************************************************/
>
> Then the reducers have the following:
>
>
> 2011-12-05 14:12:00,962 WARN org.apache.hadoop.mapred.ReduceTask: java.net.ConnectException: Connection refused
>  at java.net.PlainSocketImpl.socketConnect(Native Method)
>  at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
>  at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
>  at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
>  at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
>  at java.net.Socket.connect(Socket.java:529)
>  at sun.net.NetworkClient.doConnect(NetworkClient.java:158)
>  at sun.net.www.http.HttpClient.openServer(HttpClient.java:394)
>  at sun.net.www.http.HttpClient.openServer(HttpClient.java:529)
>  at sun.net.www.http.HttpClient.<init>(HttpClient.java:233)
>  at sun.net.www.http.HttpClient.New(HttpClient.java:306)
>  at sun.net.www.http.HttpClient.New(HttpClient.java:323)
>  at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:970)
>  at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:911)
>  at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:836)
>  at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getInputStream(ReduceTask.java:1525)
>  at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.setupSecureConnection(ReduceTask.java:1482)
>  at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1390)
>  at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1301)
>  at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1233)
>
> 2011-12-05 14:12:00,962 INFO org.apache.hadoop.mapred.ReduceTask: Task attempt_201112050908_0169_r_000005_0: Failed fetch #2 from