HDFS >> mail # dev >> Problem with BackupNode?


Re: Problem with BackupNode?
Hi Ivan,

Sorry for taking so long to answer your email. I ran the test as you asked
and found that the commit below is the one that caused the breakage. I wish I
could provide a fix, but I do not have time today.

commit 27b956fa62ce9b467ab7dd287dd6dcd5ab6a0cb3
Author: Hairong Kuang <[EMAIL PROTECTED]>
Date:   Mon Apr 11 17:15:27 2011 +0000

    HDFS-1630. Support fsedits checksum. Contrbuted by Hairong Kuang.
    git-svn-id: https://svn.apache.org/repos/asf/hadoop/hdfs/trunk@1091131 13f79535-47bb-0310-9956-ffa450edef68

Regards,
André Oriani
On Thu, Jun 16, 2011 at 07:31, Ivan Kelly <[EMAIL PROTECTED]> wrote:

> This seems to have been introduced here:
> https://github.com/apache/hadoop-hdfs/commit/27b956fa62ce9b467ab7dd287dd6dcd5ab6a0cb3#src/java/org/apache/hadoop/hdfs/server/namenode/BackupImage.java
> The backup streams never write the version, so it should never try to read
> it either. I would have expected this to fail earlier, as it's reading junk
> since the stream pointer is an int past where it should be. BackupStreams
> don't write the checksum either. This really should have failed the
> BackupNode unit test, but I think there are other problems with that. cf.
> https://issues.apache.org/jira/browse/HDFS-1521?focusedCommentId=13010242&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13010242
>
> Could you try again with code from April 10th?
>
> Another candidate for causing it could be HDFS-2003 which went in on the
> 8th of this month.
>
> On 16/06/2011 00:42, André Oriani wrote:
>
>> Hi,
>>
>> My repo is one week old, and the change I made was to modify the
>> Configuration object in BackupNode.initialize() to point the name and edit
>> dirs at other directories, so I could run both the NameNode and the
>> BackupNode on the same machine. When I copied a file to HDFS, the
>> exception below was thrown. Has anyone seen this?
>>
>>
>> 11/06/15 17:52:22 INFO ipc.Server: IPC Server handler 1 on 50100, call journal(NamenodeRegistration(localhost:8020, role=NameNode), 101, 164, [B@3951f910), rpc version=1, client version=5, methodsFingerPrint=302283637 from 192.168.1.102:56780: error: java.io.IOException: Error replaying edit log at offset 13
>> Recent opcode offsets: 1
>> java.io.IOException: Error replaying edit log at offset 13
>> Recent opcode offsets: 1
>>     at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:514)
>>     at org.apache.hadoop.hdfs.server.namenode.BackupImage.journal(BackupImage.java:242)
>>     at org.apache.hadoop.hdfs.server.namenode.BackupNode.journal(BackupNode.java:251)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>     at org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:422)
>>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1496)
>>     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1492)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at javax.security.auth.Subject.doAs(Subject.java:396)
>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1131)
>>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1490)
>> Caused by: org.apache.hadoop.fs.ChecksumException: Transaction 1 is corrupt.
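
For readers following the diagnosis quoted above (a writer that emits neither a
version nor a checksum, paired with a reader that expects both), the mismatch
can be sketched in isolation. This is only an illustration under stated
assumptions: the record layout, the class FramingMismatchSketch, and all of its
method names are hypothetical and are not the real HDFS edit log format or
API. The reader here deliberately does not validate the version it reads, so
the misalignment only surfaces a few bytes later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.zip.CRC32;

// Hypothetical record format for illustration only; NOT the HDFS edit log layout.
class FramingMismatchSketch {
    static final int VERSION = -1; // made-up version marker

    static long crcOf(byte opcode, byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(opcode);
        crc.update(payload);
        return crc.getValue();
    }

    // Writer in the style described for the backup streams: no version, no checksum.
    static byte[] writeBare(byte opcode, byte[] payload) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeByte(opcode);
        out.writeInt(payload.length);
        out.write(payload);
        out.flush();
        return buf.toByteArray();
    }

    // Writer that adds a leading version int and a trailing checksum.
    static byte[] writeFramed(byte opcode, byte[] payload) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(VERSION);
        out.writeByte(opcode);
        out.writeInt(payload.length);
        out.write(payload);
        out.writeLong(crcOf(opcode, payload));
        out.flush();
        return buf.toByteArray();
    }

    // Reader matching writeBare.
    static byte[] readBare(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        in.readByte(); // opcode, unused in this sketch
        byte[] payload = new byte[in.readInt()];
        in.readFully(payload);
        return payload;
    }

    // Reader matching writeFramed. On bare bytes it consumes four bytes of
    // record data as the "version" and is misaligned from then on.
    static byte[] readFramed(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        in.readInt(); // version, intentionally not validated here
        byte opcode = in.readByte();
        int len = in.readInt();
        if (len < 0 || len > in.available()) {
            throw new IOException("Implausible length " + len);
        }
        byte[] payload = new byte[len];
        in.readFully(payload);
        if (in.readLong() != crcOf(opcode, payload)) {
            throw new IOException("Checksum mismatch");
        }
        return payload;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "hello".getBytes("UTF-8");
        byte[] bare = writeBare((byte) 9, payload);
        System.out.println("bare reader ok: " + new String(readBare(bare), "UTF-8"));
        try {
            readFramed(bare);
        } catch (IOException e) {
            System.out.println("framed reader on bare bytes failed: " + e.getMessage());
        }
    }
}
```

In this sketch the fix on either side is symmetric: make the writer emit the
same framing the reader expects, or give the bare stream a reader that skips
the version and checksum. Which of the two is appropriate for BackupNode is a
decision for the HDFS developers and is not settled in this thread.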