Re: MapReduce job failing when a node of cluster is rebooted
Hi,

Did you set the HDFS-related directories (for example dfs.name.dir,
dfs.data.dir and hadoop.tmp.dir) outside of /tmp? Most *nix systems
clean /tmp on reboot.
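
As a rough sketch of what that could look like on 0.20.x: hadoop.tmp.dir
defaults to /tmp/hadoop-${user.name}, and dfs.name.dir and dfs.data.dir
default to subdirectories of it, so all three end up under /tmp unless they
are overridden. The /var/lib/hadoop paths below are placeholders I made up,
not values from your cluster; use whatever persistent disk you have. If a
node already holds blocks, the existing directories would have to be moved
to the new location rather than simply repointed.

```xml
<!-- core-site.xml: base directory that several other defaults derive from.
     The path is a hypothetical example. -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/var/lib/hadoop/tmp</value>
</property>

<!-- hdfs-site.xml: NameNode metadata and DataNode block storage.
     Both paths are hypothetical examples. -->
<property>
  <name>dfs.name.dir</name>
  <value>/var/lib/hadoop/dfs/name</value>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>/var/lib/hadoop/dfs/data</value>
</property>
```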

- Alex

On Tue, Dec 27, 2011 at 2:09 PM, Rajat Goel <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I have a 7-node setup (1 - Namenode/JobTracker, 6 - Datanodes/TaskTrackers)
> running Hadoop version 0.20.203.
>
> I performed the following test:
> Initially the cluster is running smoothly. About one or two minutes
> before launching a MapReduce job, I shut down one of the data nodes
> (rebooted the machine). The MapReduce job then starts but fails
> immediately with the following messages on stderr:
>
> WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
> WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
> WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
> WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
> NOTICE: Configuration: /device.map    /region.map    /url.map
> /data/output/2011/12/26/08
>  PS:192.168.100.206:11111    3600    true    Notice
> 11/12/26 09:10:26 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> 11/12/26 09:10:26 INFO input.FileInputFormat: Total input paths to process : 24
> 11/12/26 09:10:37 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 192.168.100.5:50010
> 11/12/26 09:10:37 INFO hdfs.DFSClient: Abandoning block blk_-6309642664478517067_35619
> 11/12/26 09:10:37 INFO hdfs.DFSClient: Waiting to find target node: 192.168.100.7:50010
> 11/12/26 09:10:44 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.NoRouteToHostException: No route to host
> 11/12/26 09:10:44 INFO hdfs.DFSClient: Abandoning block blk_4129088682008611797_35619
> 11/12/26 09:10:53 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 192.168.100.5:50010
> 11/12/26 09:10:53 INFO hdfs.DFSClient: Abandoning block blk_3596375242483863157_35619
> 11/12/26 09:11:01 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 192.168.100.5:50010
> 11/12/26 09:11:01 INFO hdfs.DFSClient: Abandoning block blk_724369205729364853_35619
> 11/12/26 09:11:07 WARN hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
>    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3002)
>    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2255)
>    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2446)
>
> 11/12/26 09:11:07 WARN hdfs.DFSClient: Error Recovery for block blk_724369205729364853_35619 bad datanode[1] nodes == null
> 11/12/26 09:11:07 WARN hdfs.DFSClient: Could not get block locations. Source file "/data/hadoop-admin/mapred/staging/admin/.staging/job_201112200923_0292/job.split" - Aborting...
> 11/12/26 09:11:07 INFO mapred.JobClient: Cleaning up the staging area hdfs://machine-100-205:9000/data/hadoop-admin/mapred/staging/admin/.staging/job_201112200923_0292
> Exception in thread "main" java.io.IOException: Bad connect ack with firstBadLink as 192.168.100.5:50010
>    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:3068)
>    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2983)
>    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2255)
>    at

Alexander Lorenz
http://mapredit.blogspot.com
