Re: Issue of FSImage, need help
*Root cause*: the FSImage is left in an invalid format when the user kills the
HDFS process while the image is being written. On the next load, the NameNode
may read an invalid block count (one billion or more), and the resulting
allocation triggers an OutOfMemoryError before an EOFException can ever be thrown.

How can we verify the validity of the FSImage file?

--regards
Denny Ye
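
[Editor's note] To illustrate one possible answer to the question above: a
minimal Java sketch of a digest-sidecar check, where the writer stores an MD5
digest next to the image and the loader refuses any image whose digest does not
match. The class and method names here are hypothetical, not the actual HDFS
code; the point is that a truncated or half-written image fails this cheap
check up front instead of being parsed into a bogus in-memory state.

    import java.io.*;
    import java.security.MessageDigest;

    public class FsImageDigestCheck {
        // Compute the MD5 of the file's entire contents.
        static byte[] md5Of(File f) throws Exception {
            MessageDigest md = MessageDigest.getInstance("MD5");
            try (InputStream in = new FileInputStream(f)) {
                byte[] buf = new byte[8192];
                for (int n; (n = in.read(buf)) > 0; ) {
                    md.update(buf, 0, n);
                }
            }
            return md.digest();
        }

        // After a successful save, persist the digest next to the image.
        static void writeDigest(File image, File digestFile) throws Exception {
            try (FileOutputStream out = new FileOutputStream(digestFile)) {
                out.write(md5Of(image));
            }
        }

        // Before loading, verify the image against the saved digest. An image
        // truncated by a kill during saveNamespace fails here, fast and clearly.
        static boolean isValid(File image, File digestFile) throws Exception {
            byte[] expected = new byte[16]; // MD5 digests are 16 bytes
            try (DataInputStream in =
                     new DataInputStream(new FileInputStream(digestFile))) {
                in.readFully(expected);
            }
            return MessageDigest.isEqual(expected, md5Of(image));
        }
    }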

On Tue, Jun 28, 2011 at 4:44 PM, mac fang <[EMAIL PROTECTED]> wrote:

> Hi, Team,
>
> What we found when we use Hadoop is that the FSImage often corrupts when we
> start/stop the Hadoop cluster. We think the reason lies in the write to the
> output stream: the NameNode may be killed during saveNamespace, so the
> FSImage file is never completely written. I also see a previous.checkpoint
> folder; the logic of saveNamespace is like:
>
> 1. mv the current folder to the previous.checkpoint folder.
> 2. start to write the FSImage into the current folder.
>
> I think there might be a case where, if the FSImage is corrupted, the
> NameNode can NOT be started, but we can NOT get any EOFException: instead we
> may encounter an OutOfMemoryError when we read a wrong numBlocks and
> instantiate Block[] blocks = new Block[numBlocks] (we actually face this
> issue; see the sketch after this message).
>
> Any suggestions on this?
>
> thanks
> macf
>
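
[Editor's note] The failure mode described in the quoted message, where a
corrupted image yields a garbage numBlocks and the allocation throws
OutOfMemoryError before any EOFException can surface, can be sketched as
follows. This is a minimal illustration with hypothetical names and a
hypothetical sanity bound, not the real FSImage loader:

    import java.io.*;

    class FsImageLoadSketch {
        // Hypothetical sanity bound; anything above it is treated as corruption.
        static final int MAX_PLAUSIBLE_BLOCKS = 10_000_000;

        static long[] loadBlockIds(DataInputStream in) throws IOException {
            int numBlocks = in.readInt(); // garbage if the image is corrupt
            // Without this check, new long[numBlocks] with numBlocks around one
            // billion demands gigabytes and throws OutOfMemoryError before the
            // stream is ever read far enough to raise an EOFException.
            if (numBlocks < 0 || numBlocks > MAX_PLAUSIBLE_BLOCKS) {
                throw new IOException(
                    "Corrupt FSImage: implausible block count " + numBlocks);
            }
            long[] blockIds = new long[numBlocks];
            for (int i = 0; i < numBlocks; i++) {
                blockIds[i] = in.readLong(); // truncation now yields EOFException
            }
            return blockIds;
        }
    }

Bounding the count before allocating turns the silent OutOfMemoryError into a
clear "corrupt image" error at load time.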