HBase dev mailing list: Fully qualified path names in distributed log splitting


lars hofhansl 2013-02-05, 07:32
Re: Fully qualified path names in distributed log splitting.
HBASE-7723 <https://issues.apache.org/jira/browse/HBASE-7723> attempts to
fix this.  The issue arises when moving from standard nn to HA and back.
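
One way such host-bound paths can be avoided, sketched below, is to persist only the root-relative part of the WAL path and re-qualify it against whatever the current default filesystem is. This is an illustrative sketch only, not necessarily the approach HBASE-7723 takes; the HA nameservice URI and file names in it are made up.

// Illustrative sketch: store only a root-relative WAL path and qualify it
// against the current default filesystem at read time, instead of persisting
// a host-qualified URI. "hdfs://ha-cluster" is a made-up nameservice URI.
import java.net.URI;
import org.apache.hadoop.fs.Path;

public class RequalifySketch {
    public static void main(String[] args) {
        // Only the part below the filesystem root is persisted/exchanged.
        Path relative = new Path("/.logs/rs1,60020,1358540589323-splitting/wal.1359962644861");

        // Qualify it against whatever the cluster's default FS currently is.
        URI currentDefaultFs = URI.create("hdfs://ha-cluster");
        Path qualified = relative.makeQualified(currentDefaultFs, new Path("/"));

        // Prints hdfs://ha-cluster/.logs/..., regardless of which NameNode
        // originally wrote the WAL.
        System.out.println(qualified);
    }
}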
On Mon, Feb 4, 2013 at 11:32 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:

> We just found ourselves in an interesting pickle.
>
> We were upgrading one of our clusters from HBase 0.94.0 on Hadoop 1.0.4 to
> HBase 0.94.4 on top of Hadoop 2.
> The cluster was set up a while ago, and the old shutdown script had a
> bug and shut down HBase and HDFS uncleanly.
>
> Assuming that the log would be replayed, we upgraded Hadoop to 2.0.x and
> verified that from a file system view everything was OK.
> The new HDFS runs with an HA NameNode, so the FS changed from hdfs://<old
> host name> to hdfs://<ha cluster name>
>
>
> Then we brought up HBase and found it stuck in splitting logs forever.
> In the log we see messages like these:
> 2013-02-05 06:22:31,045 ERROR org.apache.hadoop.hbase.regionserver.SplitLogWorker: unexpected error
> java.lang.IllegalArgumentException: Wrong FS: hdfs://<old NN host>/.logs/<rs host>,60020,1358540589323-splitting/<rs host>%2C60020%2C1358540589323.1359962644861, expected: hdfs://<ha cluster name>
>         at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:547)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:169)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:783)
>         at org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:111)
>         at org.apache.hadoop.hbase.regionserver.SplitLogWorker.grabTask(SplitLogWorker.java:264)
>         at org.apache.hadoop.hbase.regionserver.SplitLogWorker.taskLoop(SplitLogWorker.java:195)
>         at org.apache.hadoop.hbase.regionserver.SplitLogWorker.run(SplitLogWorker.java:163)
>         at java.lang.Thread.run(Thread.java:662)
>
> So it looks like distributed log splitting stores the full HDFS path name
> including the host, which seems unnecessary.
> This path is stored in ZK.
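
As a minimal standalone illustration of that mismatch (the host and nameservice names below are made up, standing in for <old NN host> and <ha cluster name>): the path persisted for the split task still carries the old NameNode's authority, while the worker's current filesystem is the HA nameservice, which is roughly the comparison FileSystem.checkPath fails on.

// Minimal sketch of the "Wrong FS" mismatch seen in the stack trace above.
// "oldnn.example.com" and "ha-cluster" are made-up stand-ins for
// <old NN host> and <ha cluster name>.
import java.net.URI;
import org.apache.hadoop.fs.Path;

public class WrongFsSketch {
    public static void main(String[] args) {
        // Fully qualified WAL path as recorded for the split task,
        // still pointing at the pre-upgrade NameNode.
        Path recorded = new Path(
            "hdfs://oldnn.example.com:8020/.logs/rs1,60020,1358540589323-splitting/wal.1359962644861");

        // URI of the filesystem the split worker actually talks to now.
        URI currentFs = URI.create("hdfs://ha-cluster");

        // FileSystem.checkPath compares (roughly) scheme and authority;
        // the authorities differ here, so the path is rejected as "Wrong FS".
        System.out.println("path authority: " + recorded.toUri().getAuthority()); // oldnn.example.com:8020
        System.out.println("fs authority:   " + currentFs.getAuthority());        // ha-cluster
        System.out.println("mismatch: "
            + !currentFs.getAuthority().equals(recorded.toUri().getAuthority()));
    }
}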
>
> So all in all it seems this can only happen if all of the following are true:
> an unclean shutdown, keeping the same ZK ensemble, and a changed FS.
>
>
> The data is not important; we can just blow it away, but we want to prove
> that we could recover it if we had to.
> It seems we have three options:
>
> 1. Blow away the data in ZK under "splitlog" and restart HBase. It should
> restart the split process with the correct path names (a rough sketch of this
> follows below).
>
> 2. Temporarily change the config for the region server to set the root dir
> to hdfs://<old NN host> and bounce HBase. The log splitting should then be
> able to succeed.
> 3. Downgrade back to the old Hadoop (we kept a copy of the image).
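
For option 1, a rough sketch of what clearing the stale task znodes could look like with the plain ZooKeeper Java client. It assumes the default zookeeper.znode.parent of /hbase (so the tasks live under /hbase/splitlog) and a made-up connect string; HBase should be stopped while this runs.

// Rough sketch for option 1: remove the stale split-log task znodes.
// Assumes the default zookeeper.znode.parent (/hbase); the connect string
// below ("zk1:2181,...") is a placeholder for the real ensemble.
import org.apache.zookeeper.ZooKeeper;

public class ClearSplitLogTasks {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("zk1:2181,zk2:2181,zk3:2181", 30000, event -> { });
        try {
            String splitlog = "/hbase/splitlog";
            // Each child znode is one split task, named after the fully
            // qualified WAL path; delete them so HBase re-creates the tasks
            // with path names matching the current filesystem.
            for (String child : zk.getChildren(splitlog, false)) {
                zk.delete(splitlog + "/" + child, -1); // -1 = ignore znode version
            }
        } finally {
            zk.close();
        }
    }
}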
>
> We're trying option #2, to see whether that would fix it. #1 should work
> too.
>
>
> Has anybody else experienced this?
> It seems this would also limit our ability to take a snapshot of a
> filesystem and move it somewhere else, as the host names are hardcoded,
> at least in ZK for log splitting.
>
>
> -- Lars
>
Further replies in this thread:
lars hofhansl 2013-02-05, 07:48
Matteo Bertozzi 2013-02-05, 07:39