MapReduce, mail # user - Failures in the reducers


Re: Failures in the reducers
David Rosenstrauch 2010-10-13, 15:57
We ran into this recently.  The solution was to bump up the value of the
dfs.datanode.max.xcievers setting; the "xceiverCount 258 exceeds the limit
of concurrent xcievers 256" error at the end of your datanode log is the
telltale sign.
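
For reference, a minimal sketch of the change, assuming the usual
hdfs-site.xml on each datanode (4096 is just an example value, not a
tuned recommendation for your cluster; note the property name really is
spelled "xcievers"):

  <property>
    <!-- max concurrent DataXceiver threads per datanode (default 256) -->
    <name>dfs.datanode.max.xcievers</name>
    <value>4096</value>
  </property>

The datanodes need a restart to pick up the new limit.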

HTH,

DR

On 10/12/2010 03:53 PM, rakesh kothari wrote:
>
> Hi,
>
> My MR job is processing 24 gzipped files, each around 450 MB. The file block size is 512 MB.
>
> This job is failing consistently in the reduce phase with the exception below. Any ideas on how to troubleshoot this?
>
> Thanks,
> -Rakesh
>
> Reduce task logs:
>
> INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 10 segments left of total size: 408736960 bytes
>
> 2010-10-12 07:25:01,020 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 10.185.13.61:50010
> 2010-10-12 07:25:01,021 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-961587459095414398_368580
> 2010-10-12 07:25:07,206 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink 10.185.13.61:50010
> 2010-10-12 07:25:07,206 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-7795697604292519140_368580
> 2010-10-12 07:27:05,526 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
> 2010-10-12 07:27:05,527 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-7687883740524807660_368625
> 2010-10-12 07:27:11,713 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
> 2010-10-12 07:27:11,713 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-5546440551650461919_368626
> 2010-10-12 07:27:17,898 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
> 2010-10-12 07:27:17,898 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-3894897742813130478_368628
> 2010-10-12 07:27:24,081 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
> 2010-10-12 07:27:24,081 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_8687736970664350304_368652
> 2010-10-12 07:27:30,186 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2812)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2076)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2262)
>
> 2010-10-12 07:27:30,186 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_8687736970664350304_368652 bad datanode[0] nodes == null
> 2010-10-12 07:27:30,186 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file "/tmp/dartlog-json-serializer/20100929_/_temporary/_attempt_201010082153_0040_r_000000_2/jp/dart-imp-json/2010/09/29/17/part-r-00000.gz" - Aborting...
> 2010-10-12 07:27:30,196 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
> java.io.EOFException
>         at java.io.DataInputStream.readByte(DataInputStream.java:250)
>         at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
>         at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319)
>         at org.apache.hadoop.io.Text.readString(Text.java:400)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2868)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2793)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2076)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2262)
> 2010-10-12 07:27:30,199 INFO org.apache.hadoop.mapred.TaskRunner: Runnning cleanup for the task
>
> The datanode is throwing the following exception:
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
>         at java.lang.Thread.run(Thread.java:619)
> 2010-10-12 07:27:30,272 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_786696549206331718_368657 src: /10.184.82.24:53457 dest: /10.43.102.69:50010
> 2010-10-12 07:27:30,459 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_-6729043740571856940_368657 src: /10.185.13.60:41816 dest: /10.43.102.69:50010
> 2010-10-12 07:27:30,468 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /10.185.13.61:48770, dest: /10.43.102.69:50010, bytes: 1626784, op: HDFS_WRITE, cliID: DFSClient_attempt_201010082153_0040_r_000000_2, srvID: DS-859924705-10.43.102.69-50010-1271546912162, blockid: blk_9216465415312085861_368611
> 2010-10-12 07:27:30,468 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block blk_9216465415312085861_368611 terminating
> 2010-10-12 07:27:30,755 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_5680087852988027619_321244
> 2010-10-12 07:27:30,759 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_-1637914415591966611_321290
>         at java.lang.Thread.run(Thread.java:619)
> 2010-10-12 07:27:58,976 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.43.102.69:50010, storageID=DS-859924705-10.43.102.69-50010-1271546912162, infoPort=8501, ipcPort=50020):DataXceiver
> java.io.IOException: xceiverCount 258 exceeds the limit of concurrent xcievers 256
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:88)
>         at java.lang.Thread.run(Thread.java:619)