Hadoop map task deadlocking?
Robert Dyer 2012-07-03, 20:44
I am running Hadoop 1.0.3 on a small cluster (1 namenode, 1 jobtracker, 2 compute+data nodes). My input file is a SequenceFile of around 129 MB consisting of Text keys and BytesWritable values.

This job creates 2 map tasks. The first runs to completion and exits without any error. The second seems to be stuck in the initializing state. If I leave it, it never finishes, never times out, and never reports an error. It just runs forever, and the job never completes (in any state!). I must manually kill the job on the cluster.

Any ideas? Attaching full Java thread dump of the 'stuck' map task:

==============================================
2012-06-28 16:35:23
Full thread dump OpenJDK 64-Bit Server VM (22.0-b10 mixed mode):

"SpillThread" daemon prio=10 tid=0x00007f1b90738800 nid=0x5445 waiting on condition [0x00007f1b87080000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000f9cbdbc0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1340)

"communication thread" daemon prio=10 tid=0x00007f1b906bd000 nid=0x543c runnable [0x00007f1b8717f000]
   java.lang.Thread.State: RUNNABLE
        at java.util.Arrays.copyOf(Arrays.java:2367)
        at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
        at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
        at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:415)
        at java.lang.StringBuilder.append(StringBuilder.java:132)
        at java.net.URLStreamHandler.parseURL(URLStreamHandler.java:249)
        at sun.net.www.protocol.file.Handler.parseURL(Handler.java:67)
        at java.net.URL.<init>(URL.java:612)
        at java.net.URL.<init>(URL.java:480)
        at sun.misc.URLClassPath$FileLoader.getResource(URLClassPath.java:1035)
        at sun.misc.URLClassPath$FileLoader.findResource(URLClassPath.java:1024)
        at sun.misc.URLClassPath.findResource(URLClassPath.java:172)
        at java.net.URLClassLoader$2.run(URLClassLoader.java:549)
        at java.net.URLClassLoader$2.run(URLClassLoader.java:547)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findResource(URLClassLoader.java:546)
        at java.lang.ClassLoader.getResource(ClassLoader.java:1134)
        at java.net.URLClassLoader.getResourceAsStream(URLClassLoader.java:227)
        at java.util.ResourceBundle$Control$1.run(ResourceBundle.java:2600)
        at java.util.ResourceBundle$Control$1.run(ResourceBundle.java:2585)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.util.ResourceBundle$Control.newBundle(ResourceBundle.java:2584)
        at java.util.ResourceBundle.loadBundle(ResourceBundle.java:1436)
        at java.util.ResourceBundle.findBundle(ResourceBundle.java:1400)
        at java.util.ResourceBundle.findBundle(ResourceBundle.java:1354)
        at java.util.ResourceBundle.findBundle(ResourceBundle.java:1354)
        at java.util.ResourceBundle.getBundleImpl(ResourceBundle.java:1296)
        at java.util.ResourceBundle.getBundle(ResourceBundle.java:724)
        at org.apache.hadoop.mapred.Counters.getResourceBundle(Counters.java:385)
        at org.apache.hadoop.mapred.Counters.access$100(Counters.java:51)
        at org.apache.hadoop.mapred.Counters$Group.<init>(Counters.java:166)
        at org.apache.hadoop.mapred.Counters.getGroup(Counters.java:414)
        - locked <0x00000000f9d3abd0> (a org.apache.hadoop.mapred.Counters)
        at org.apache.hadoop.mapred.Counters.findCounter(Counters.java:445)
        - locked <0x00000000f9d3abd0> (a org.apache.hadoop.mapred.Counters)
        at org.apache.hadoop.mapred.Task$FileSystemStatisticUpdater.updateCounters(Task.java:775)
        at org.apache.hadoop.mapred.Task.updateCounters(Task.java:827)
        - locked <0x00000000f9d22e68> (a org.apache.hadoop.mapred.MapTask)
        at org.apache.hadoop.mapred.Task.access$600(Task.java:66)
        at org.apache.hadoop.mapred.Task$TaskReporter.run(Task.java:666)
        at java.lang.Thread.run(Thread.java:722)

"Timer for 'MapTask' metrics system" daemon prio=10 tid=0x00007f1b9066e800 nid=0x543a in Object.wait() [0x00007f1b875ad000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000f9d02bd0> (a java.util.TaskQueue)
        at java.util.TimerThread.mainLoop(Timer.java:552)
        - locked <0x00000000f9d02bd0> (a java.util.TaskQueue)
        at java.util.TimerThread.run(Timer.java:505)

"Thread for syncLogs" daemon prio=10 tid=0x00007f1b904d3800 nid=0x5439 waiting for monitor entry [0x00007f1b878b3000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at java.util.zip.ZipCoder.getBytes(ZipCoder.java:80)
        at java.util.zip.ZipFile.getEntry(ZipFile.java:302)
        - locked <0x00000000f9d2a040> (a java.util.jar.JarFile)
        at java.util.jar.JarFile.getEntry(JarFile.java:225)
        at java.util.jar.JarFile.getJarEntry(JarFile.java:208)
        at sun.misc.URLClassPath$JarLoader.getResource(URLClassPath.java:817)
        at sun.misc.URLClassPath$JarLoader.findResource(URLClassPath.java:795)
        at sun.misc.URLClassPath.findResource(URLClassPath.java:172)
        at java.net.URLClassLoader$2.run(URLClassLoader.java:549)
        at java.net.URLClassLoader$2.run(URLClassLoader.java:547)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findResource(URLClassLoader.java:546)
        at java.lang.ClassLoader.getResource(ClassLoader.java:1134)
        at java.lang.ClassLoader.getResource(ClassLoader.java:1129)
        at java.lang.ClassLoader.getSystemResource(ClassLoader.java:1256)
        at java.lang.ClassLoader.getSystemResourceAsStream(ClassLoader.java:1359)
        at java.lang.Class.getResourceAsStream(Class.java:2045)
        at javax.xml.parsers.SecuritySupport$4.run(SecuritySupport.java:92)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.xml.parsers.SecuritySupport.getResourceAsStream(Secu