-
Is it possible to read a corrupted Sequence File?
Hs 2012-11-24, 02:08
Hi,
I am running Hadoop 1.0.3 and hbase-0.94.0 on a 12-node cluster. Due to unknown operational faults, 6 datanodes have suffered a complete data loss (the HDFS data directory is gone). When I restart Hadoop, it reports "*The ratio of reported blocks 0.8252*".
I have a folder in hdfs containing many important files in hadoop SequenceFile format. The hadoop fsck tool shows that (in this folder)
Total size: 134867556461 B
Total dirs: 16
Total files: 251
Total blocks (validated): 2136 (avg. block size 63140241 B)
********************************
CORRUPT FILES: 167
MISSING BLOCKS: 405
MISSING SIZE: 25819446263 B
CORRUPT BLOCKS: 405
********************************
I wonder if I can read these corrupted SequenceFiles with the missing blocks skipped? Or, what else can I do now to recover as much of these SequenceFiles as possible?
Please save me.
Thanks !
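[Editorial sketch] To put the fsck figures quoted above in proportion, the short Python snippet below computes what share of the bytes, blocks, and files in that folder is affected. The numbers are copied from the fsck output in this message; the variable and function names are illustrative only and are not part of any Hadoop tool.

```python
# Figures copied from the fsck summary quoted in the message above.
TOTAL_SIZE_B = 134867556461
MISSING_SIZE_B = 25819446263
TOTAL_BLOCKS = 2136
MISSING_BLOCKS = 405
TOTAL_FILES = 251
CORRUPT_FILES = 167


def fraction(part, whole):
    """Return part/whole as a float, rejecting an empty total."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole


def damage_summary():
    """Share of bytes, blocks, and files reported missing or corrupt."""
    return {
        "missing_bytes": fraction(MISSING_SIZE_B, TOTAL_SIZE_B),
        "missing_blocks": fraction(MISSING_BLOCKS, TOTAL_BLOCKS),
        "corrupt_files": fraction(CORRUPT_FILES, TOTAL_FILES),
    }


if __name__ == "__main__":
    for name, value in damage_summary().items():
        print(f"{name}: {value:.1%}")
```

On these figures, roughly 19% of the bytes and blocks are missing, yet about two thirds of the files are touched by at least one missing block. Most of the data in each damaged file may therefore still be present on the surviving datanodes.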
-
Re: Is it possible to read a corrupted Sequence File?
Radim Kolar 2012-11-24, 03:12
> I wonder if I can read these corrupted SequenceFiles with missing blocks
> skipped ?
It's possible to recover the existing blocks and repair the seq file structure.
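[Editorial sketch] The reply above does not describe a procedure, so the following is only an illustration of one idea that the SequenceFile format makes possible, not the method the poster has in mind. A SequenceFile header stores a 16-byte sync marker, and the writer periodically emits a sync point consisting of the 4-byte integer -1 followed by that marker. Assuming the marker is known (for example, read from a surviving header), the raw bytes of a surviving block can be scanned for sync points, which are the places where record parsing could resume. The function name `find_sync_points` and the demo data are hypothetical; the sketch only locates offsets and does not decode records.

```python
from typing import List

SYNC_ESCAPE = b"\xff\xff\xff\xff"  # the int -1 that precedes each sync marker
SYNC_HASH_SIZE = 16                # length of the marker stored in the header


def find_sync_points(data: bytes, sync_marker: bytes) -> List[int]:
    """Return byte offsets in `data` where a sync point begins.

    `sync_marker` is assumed to come from the file's own header; without
    it, this approach does not apply as written.
    """
    if len(sync_marker) != SYNC_HASH_SIZE:
        raise ValueError("sync marker must be 16 bytes")
    pattern = SYNC_ESCAPE + sync_marker
    offsets = []
    pos = data.find(pattern)
    while pos != -1:
        offsets.append(pos)
        # Continue the search after the sync point just found.
        pos = data.find(pattern, pos + len(pattern))
    return offsets


# Synthetic demo data, not a real SequenceFile.
demo_marker = bytes(range(16))
demo_block = (
    b"garbage"
    + SYNC_ESCAPE + demo_marker + b"record-1"
    + SYNC_ESCAPE + demo_marker + b"record-2"
)
demo_offsets = find_sync_points(demo_block, demo_marker)
```

In Hadoop's own Java API, `SequenceFile.Reader.sync(long position)` seeks to the next sync point after a given position, which is the library-level counterpart of this scan when the file can still be opened.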
-
Re: Is it possible to read a corrupted Sequence File?
Hs 2012-11-24, 03:55
Could you please provide a little more detail? Should I run "hadoop fsck / -move" first to move the broken files into /lost+found and then repair them? Or can I repair them directly in their current path?
Thanks!
2012/11/24 Radim Kolar <[EMAIL PROTECTED]>
> > I wonder if I can read these corrupted SequenceFiles with missing blocks
> > skipped ?
> Its possible to recover existing blocks and repair seq file structure.
-
Re: Is it possible to read a corrupted Sequence File?
Radim Kolar 2012-11-24, 16:32
> Could you please provide a little more detail?
It's a low-level task; repairing seq files with a missing header is not so easy. But to date we have been able to repair pretty much everything people needed, including corrupted HBase metadata. It is a few days of work, though, which is far beyond what you can get for free on a community mailing list.
> Should I run "hadoop fsck -move " first to move broken files into /lost+found and then repair
> them?
No, it does more harm than good. The best advice is to never try to do any data recovery yourself and to seek an expert.
But if the cost of expert work is more than the cost of the data, then yes, why not try it? I have never used it myself, but you probably do not have many other choices.
-
Re: Is it possible to read a corrupted Sequence File?
Radim Kolar 2012-11-24, 17:19
Also, if you are going to seek expert help, disconnect the drives with the disappearing data; the data will still be there in good shape as long as it is not overwritten. And of course, do not delete any fsimage from the name node.