HDFS >> mail # user >> Re: hadoop missing file?


Re: hadoop missing file?
Short answer: (10 - 3) * 129 = 903

Now the long answer, which starts with two questions:
1) Which file is missing?
2) How do you know it is missing?

You have a cluster with 3 datanodes. The default replication factor is 3,
but the job jar uses 10 instead (mapred.submit.replication).
Let's say you ran 129 jobs that failed in a weird way (for example at
submission). You would then have 129 under-replicated blocks (one block per
jar, because your jar is small). Each of those blocks wants 10 replicas but
can only get 3, because with 3 datanodes you can't have more than 3 replicas
anyway. That is 7 missing replicas per block, or 903 in total.
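
To double-check the arithmetic above, here is a small Python sketch. It assumes, as guessed above, that all 129 under-replicated blocks are job jars written with mapred.submit.replication = 10; the other figures come from the fsck report quoted below. The variable and function names are only illustrative.

```python
# Figures taken from the fsck report quoted in this thread.
total_blocks = 5651
under_replicated_blocks = 129
datanodes = 3
default_replication = 3

# Assumption: every under-replicated block is a job jar written with
# mapred.submit.replication = 10.
submit_replication = 10


def missing_replicas(blocks, target_replication, live_datanodes):
    """Replicas that cannot exist because the target exceeds the datanode count."""
    achievable = min(target_replication, live_datanodes)
    return blocks * (target_replication - achievable)


missing = missing_replicas(under_replicated_blocks, submit_replication, datanodes)

# Replicas the cluster would hold if every block reached its target:
# ordinary blocks at factor 3, job-jar blocks at factor 10.
expected = ((total_blocks - under_replicated_blocks) * default_replication
            + under_replicated_blocks * submit_replication)

print(missing)                                                 # 903
print(round(100 * missing / expected, 4))                      # 5.0571
print(round(100 * under_replicated_blocks / total_blocks, 4))  # 2.2828
```

Under that assumption the numbers line up with the report: 903 missing replicas, and both percentages match what fsck printed. This is only a consistency check on the figures; it does not prove that the under-replicated blocks really are job jars.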

So, back to the first question: which file is missing?
It might simply be that the file was never uploaded in the first place. It
happens.

Every one of your blocks has at least one replica. fsck reports:
Minimally replicated blocks:   5651 (100.0 %)
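
If you want to run that check automatically, a rough Python sketch could look like the following. It assumes the fsck summary has the same text layout as the report quoted below; parse_counts and every_block_has_a_replica are hypothetical helper names, not part of Hadoop.

```python
import re

# Excerpt of the fsck summary quoted in this thread.
FSCK_SUMMARY = """\
Status: HEALTHY
 Total blocks (validated):      5651 (avg. block size 48136763 B)
 Minimally replicated blocks:   5651 (100.0 %)
 Under-replicated blocks:       129 (2.2827818 %)
 Corrupt blocks:                0
 Missing replicas:              903 (5.0571237 %)
"""


def parse_counts(summary):
    """Map each 'Label: <integer> ...' line to its leading integer value."""
    counts = {}
    for line in summary.splitlines():
        match = re.match(r"\s*([^:]+):\s+(\d+)", line)
        if match:
            counts[match.group(1).strip()] = int(match.group(2))
    return counts


def every_block_has_a_replica(counts):
    """True when no block is corrupt and all blocks are minimally replicated."""
    return (counts["Corrupt blocks"] == 0
            and counts["Minimally replicated blocks"]
            == counts["Total blocks (validated)"])


counts = parse_counts(FSCK_SUMMARY)
print(every_block_has_a_replica(counts))  # True
```

A result of True only says that no block known to the namenode has lost all of its replicas. It says nothing about a file that was never written to HDFS.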

Bertrand

On Tue, Jul 30, 2013 at 8:54 AM, ch huang <[EMAIL PROTECTED]> wrote:

> One of my coworkers told me that some of his files are missing. I ran fsck
> and found the following info. How can I prevent files from going missing?
> Can anyone help me?
>
> Status: HEALTHY
>  Total size:    272020850157 B (Total open files size: 652056 B)
>  Total dirs:    1143
>  Total files:   1886 (Files currently being written: 2)
>  Total blocks (validated):      5651 (avg. block size 48136763 B) (Total open file blocks (not validated): 1)
>  Minimally replicated blocks:   5651 (100.0 %)
>  Over-replicated blocks:        0 (0.0 %)
>  Under-replicated blocks:       129 (2.2827818 %)
>  Mis-replicated blocks:         0 (0.0 %)
>  Default replication factor:    3
>  Average block replication:     3.0
>  Corrupt blocks:                0
>  Missing replicas:              903 (5.0571237 %)
>  Number of data-nodes:          3
>  Number of racks:               1
> FSCK ended at Tue Jul 30 14:38:01 CST 2013 in 462 milliseconds
>

--
Bertrand Dechoux