Hadoop >> mail # user >> Re: MapReduce job failing when a node of cluster is rebooted


Re: MapReduce job failing when a node of cluster is rebooted
Hi,

Did you set the HDFS-related directories outside of /tmp? Most *nix
systems clean /tmp on reboot.
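
For illustration, a minimal sketch of what that could look like on
0.20.x. The property names (hadoop.tmp.dir, dfs.name.dir, dfs.data.dir)
are the standard ones for this release line, and by default the two dfs
directories live under hadoop.tmp.dir, which itself defaults to
/tmp/hadoop-${user.name}. The /data/hadoop/... paths below are
hypothetical; use whatever persistent disk your nodes have.

```xml
<!-- core-site.xml: base directory for Hadoop's temporary and derived paths.
     The path is an assumption for this example. -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoop/tmp</value>
  </property>
</configuration>
```

```xml
<!-- hdfs-site.xml: NameNode metadata and DataNode block storage.
     Both paths are assumptions for this example. -->
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/data/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/data/hadoop/dfs/data</value>
  </property>
</configuration>
```

If the values currently in effect on your nodes already point outside
/tmp, this is not the cause and the datanode logs are the next place to
look.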

- Alex

On Tue, Dec 27, 2011 at 2:09 PM, Rajat Goel <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I have a 7-node setup (1 - Namenode/JobTracker, 6 - Datanodes/TaskTrackers)
> running Hadoop version 0.20.203.
>
> I performed the following test:
> Initially the cluster is running smoothly. One or two minutes before
> launching a MapReduce job, I shut down one of the data nodes (rebooted the
> machine). The MapReduce job then starts but immediately fails with the
> following messages on stderr:
>
> WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please
> use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties
> files.
> WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please
> use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties
> files.
> WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please
> use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties
> files.
> WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please
> use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties
> files.
> NOTICE: Configuration: /device.map    /region.map    /url.map
> /data/output/2011/12/26/08
>  PS:192.168.100.206:11111    3600    true    Notice
> 11/12/26 09:10:26 WARN mapred.JobClient: Use GenericOptionsParser for
> parsing the arguments. Applications should implement Tool for the same.
> 11/12/26 09:10:26 INFO input.FileInputFormat: Total input paths to process
> : 24
> 11/12/26 09:10:37 INFO hdfs.DFSClient: Exception in createBlockOutputStream
> java.io.IOException: Bad connect ack with firstBadLink as
> 192.168.100.5:50010
> 11/12/26 09:10:37 INFO hdfs.DFSClient: Abandoning block
> blk_-6309642664478517067_35619
> 11/12/26 09:10:37 INFO hdfs.DFSClient: Waiting to find target node:
> 192.168.100.7:50010
> 11/12/26 09:10:44 INFO hdfs.DFSClient: Exception in createBlockOutputStream
> java.net.NoRouteToHostException: No route to host
> 11/12/26 09:10:44 INFO hdfs.DFSClient: Abandoning block
> blk_4129088682008611797_35619
> 11/12/26 09:10:53 INFO hdfs.DFSClient: Exception in createBlockOutputStream
> java.io.IOException: Bad connect ack with firstBadLink as
> 192.168.100.5:50010
> 11/12/26 09:10:53 INFO hdfs.DFSClient: Abandoning block
> blk_3596375242483863157_35619
> 11/12/26 09:11:01 INFO hdfs.DFSClient: Exception in createBlockOutputStream
> java.io.IOException: Bad connect ack with firstBadLink as
> 192.168.100.5:50010
> 11/12/26 09:11:01 INFO hdfs.DFSClient: Abandoning block
> blk_724369205729364853_35619
> 11/12/26 09:11:07 WARN hdfs.DFSClient: DataStreamer Exception:
> java.io.IOException: Unable to create new block.
>    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3002)
>    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2255)
>    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2446)
>
> 11/12/26 09:11:07 WARN hdfs.DFSClient: Error Recovery for block
> blk_724369205729364853_35619 bad datanode[1] nodes == null
> 11/12/26 09:11:07 WARN hdfs.DFSClient: Could not get block locations.
> Source file
> "/data/hadoop-admin/mapred/staging/admin/.staging/job_201112200923_0292/job.split"
> - Aborting...
> 11/12/26 09:11:07 INFO mapred.JobClient: Cleaning up the staging area
> hdfs://machine-100-205:9000/data/hadoop-admin/mapred/staging/admin/.staging/job_201112200923_0292
> Exception in thread "main" java.io.IOException: Bad connect ack with
> firstBadLink as 192.168.100.5:50010
>    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:3068)
>    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2983)
>    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2255)
>    at

Alexander Lorenz
http://mapredit.blogspot.com

Think of the environment: please don't print this email unless you
really need to.