MapReduce >> mail # user >> Hadoop recovery test


Re: Hadoop recovery test
Hi Artem,
At what point do you make the copy? Was the namenode still running? Do the
copies of the edits file and fsimage file match the originals (e.g., in
file size)?

-Robert
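
The check Robert asks about can be scripted. The following is a minimal sketch, not part of the original exchange, that compares a copied metadata directory against the original by file size and then by SHA-256 checksum. The function names and the directory paths in the usage comment are hypothetical; substitute whatever dfs.name.dir points to on your cluster.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_dirs(original, copy):
    """List (relative_path, problem) pairs for files in `original`
    that are missing from, or differ in, `copy`."""
    original, copy = Path(original), Path(copy)
    problems = []
    for src in sorted(p for p in original.rglob("*") if p.is_file()):
        rel = src.relative_to(original)
        dst = copy / rel
        if not dst.is_file():
            problems.append((rel.as_posix(), "missing in copy"))
        elif src.stat().st_size != dst.stat().st_size:
            problems.append((rel.as_posix(), "size differs"))
        elif file_digest(src) != file_digest(dst):
            problems.append((rel.as_posix(), "checksum differs"))
    return problems


# Hypothetical usage; both paths are assumptions, not taken from the thread:
#   for name, problem in compare_dirs("/data/name", "/backup/name"):
#       print(name, problem)
```

An empty result only shows that the copy matches the original at the moment of comparison. If the namenode was still running while the copy was made, the edits file may have been changing underneath it, which is why the timing question matters as well.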

On Mon, Sep 17, 2012 at 2:38 PM, Artem Ervits <[EMAIL PROTECTED]> wrote:

> Hello all,
>
> I am testing Hadoop recovery as per the
> http://wiki.apache.org/hadoop/NameNode document. But instead of using an
> NFS share, I am copying to another directory. Then, when I shut down the
> cluster, I scp that directory to another server and start the Hadoop cluster
> using that machine as the namenode. I see in the log that some blocks are
> corrupt and/or missing. Do I have to wait for replication to recover all
> blocks, or am I doing something else altogether? I am using Hadoop 1.0.3.
> Can someone point me to a more detailed document than the wiki, in case I'm
> doing something wrong?
>
> P.S. If I restart the cluster using the original namenode, the filesystem
> reports as healthy.
>
> Thank you.
>
> /hdfs/hadoop/tmp/mapred/system/jobtracker.info: CORRUPT block blk_9043419219670949307
> /hdfs/hadoop/tmp/mapred/system/jobtracker.info: MISSING 1 blocks of total size 4 B...
> /user/hduser/teragen/_logs/history/job_201209120941_0002_1347458152167_hduser_TeraGen: Under replicated blk_-976282286234272458_1079. Target Replicas is 3 but found 1 replica(s).
> /user/hduser/teragen/_logs/history/job_201209120941_0002_conf.xml: Under replicated blk_137658109390447967_1075. Target Replicas is 3 but found 1 replica(s).
> /user/hduser/teragen/_partition.lst: Under replicated blk_-3005280481530403302_1080. Target Replicas is 3 but found 1 replica(s).
> /user/hduser/teragen/part-00000: Under replicated blk_-7008813028808832816_1077. Target Replicas is 3 but found 1 replica(s).
> /user/hduser/teragen/part-00001: Under replicated blk_-5256967771026054061_1078. Target Replicas is 3 but found 1 replica(s).
> /user/hduser/teragen-out/_logs/history/job_201209120941_0003_1347458249920_hduser_TeraSort: Under replicated blk_1137779303840586677_1089. Target Replicas is 3 but found 1 replica(s).
> /user/hduser/teragen-out/_logs/history/job_201209120941_0003_conf.xml: Under replicated blk_7701720691642589882_1086. Target Replicas is 3 but found 1 replica(s).
> /user/hduser/teragen-out/part-00000: CORRUPT block blk_8059469267617478950
> /user/hduser/teragen-out/part-00000: MISSING 1 blocks of total size 1000000 B...
> /user/hduser/teragen-validate/_logs/history/job_201209120941_0004_1347458495941_hduser_TeraValidate: Under replicated blk_5680565744062298575_1098. Target Replicas is 3 but found 1 replica(s).
> /user/hduser/teragen-validate/_logs/history/job_201209120941_0004_conf.xml: Under replicated blk_1566253937037013126_1095. Target Replicas is 3 but found 1 replica(s).
>
> Status: CORRUPT
>  Total size:    1050720258 B
>  Total dirs:    39
>  Total files:   32
>  Total blocks (validated):      42 (avg. block size 25017149 B)
>   ********************************
>   CORRUPT FILES:        2
>   MISSING BLOCKS:       2
>   MISSING SIZE:         1000004 B
>   CORRUPT BLOCKS:       2
>   ********************************
>  Minimally replicated blocks:   40 (95.2381 %)
>  Over-replicated blocks:        0 (0.0 %)
>  Under-replicated blocks:       40 (95.2381 %)
>  Mis-replicated blocks:         0 (0.0 %)
>  Default replication factor:    3
>  Average block replication:     0.95238096
>  Corrupt blocks:                2
>  Missing replicas:              80 (200.0 %)
>  Number of data-nodes:          1
>  Number of racks:               1
> FSCK ended at Mon Sep 17 17:29:08 EDT 2012 in 21 milliseconds
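
The summary figures in the quoted fsck output are consistent with one another, and a short calculation shows why. HDFS keeps at most one replica of a block per datanode, so with "Number of data-nodes: 1" and a target of 3, every block that is present can have only one replica and is short by two. The sketch below is an illustration added here, not something posted in the thread: it parses a few summary lines copied from the output above and recomputes the missing-replica count. The function names are hypothetical.

```python
import re

# Summary lines copied from the fsck output quoted above.
FSCK_SUMMARY = """\
Total blocks (validated):      42 (avg. block size 25017149 B)
Minimally replicated blocks:   40 (95.2381 %)
Under-replicated blocks:       40 (95.2381 %)
Default replication factor:    3
Missing replicas:              80 (200.0 %)
Number of data-nodes:          1
"""


def parse_summary(text):
    """Map each 'label: value ...' line to its leading integer value."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"\s*([^:]+):\s+(\d+)", line)
        if match:
            fields[match.group(1).strip()] = int(match.group(2))
    return fields


def expected_missing_replicas(fields):
    """Recompute the missing-replica count from the other fields.

    Assumes every under-replicated block has as many replicas as the
    number of datanodes allows, which matches the
    'found 1 replica(s)' lines in the listing.
    """
    target = fields["Default replication factor"]
    reachable = min(target, fields["Number of data-nodes"])
    return fields["Under-replicated blocks"] * (target - reachable)


fields = parse_summary(FSCK_SUMMARY)
print(expected_missing_replicas(fields), fields["Missing replicas"])
```

Both numbers come out to 80, that is, 40 blocks times 2 absent replicas each. The 200.0 % figure is consistent with dividing those 80 by the 40 replicas that do exist, and the average block replication of 0.95238096 with 40 replicas over 42 blocks; that reading of how fsck derives the percentages is an assumption. None of this accounts for the 2 corrupt/missing blocks, which have no replica at all on the single datanode and so cannot be restored by waiting for replication alone.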