MapReduce >> mail # user >> Is it possible to read a corrupted Sequence File


Hs 2012-11-24, 02:12
Re: Is it possible to read a corrupted Sequence File
I guess one way might be to write your own DFS reader that ignores the exceptions and reads whatever it can.
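Something like the following rough sketch, against the Hadoop 1.x SequenceFile.Reader API: records are read normally, and any read error makes the reader sync() forward to the next sync marker and carry on. The class name and the output handling are placeholders and it is untested; note too that sync() has to read through the gap to find the next marker, so when a whole HDFS block is missing it may fail as well, in which case each file is only recoverable up to its first missing block.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

// Hypothetical salvage tool; the output handling is left as a stub.
public class SalvageSequenceFile {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    Path in = new Path(args[0]);                   // the damaged file
    FileSystem fs = in.getFileSystem(conf);

    SequenceFile.Reader reader = new SequenceFile.Reader(fs, in, conf);
    Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
    Writable val = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);

    long recovered = 0, skips = 0;
    try {
      while (true) {
        long pos = reader.getPosition();
        try {
          if (!reader.next(key, val)) break;       // clean end of file
          recovered++;
          // TODO: write key/val to a fresh SequenceFile here
        } catch (IOException readError) {
          // A record failed (bad checksum, missing block, ...):
          // skip forward to the next sync marker and try again.
          skips++;
          try {
            reader.sync(pos + 1);
          } catch (IOException syncError) {
            // sync() reads through the gap to find the marker, so it
            // can also fail when a whole HDFS block is gone; stop here.
            break;
          }
          if (reader.getPosition() <= pos) break;  // no forward progress
        }
      }
    } finally {
      reader.close();
    }
    System.err.println(in + ": recovered " + recovered + " records, " + skips + " skips");
  }
}

You would run it once per corrupt file and copy whatever records survive into a fresh SequenceFile.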

Sent from my iPad

On Nov 23, 2012, at 6:12 PM, Hs <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I am running hadoop 1.0.3 and hbase 0.94.0 on a 12-node cluster. Due to unknown operational faults, 6 datanodes have suffered a complete data loss (the hdfs data directory is gone). When I restart hadoop, it reports "The ratio of reported blocks 0.8252".
>
> I have a folder in hdfs containing many important files in hadoop SequenceFile format. The hadoop fsck tool shows the following for this folder:
>
> Total size:    134867556461 B
>  Total dirs:    16
>  Total files:   251
>  Total blocks (validated):      2136 (avg. block size 63140241 B)
>   ********************************
>   CORRUPT FILES:        167
>   MISSING BLOCKS:       405
>   MISSING SIZE:         25819446263 B
>   CORRUPT BLOCKS:       405
>   ********************************
>
> I wonder if I can read these corrupted SequenceFiles with the missing blocks skipped? Or, what else can I do now to recover as much of these SequenceFiles as possible?
>
> Please save me.
>
> Thanks !
>
> (Sorry for duplicating this post on the user and hdfs-dev lists; I do not know exactly where I should put it.)