HBase, mail # dev - Fully qualified path names in distributed log splitting.


Re: Fully qualified path names in distributed log splitting.
Matteo Bertozzi 2013-02-05, 07:39
HBASE-7723 - Remove NN URI from ZK splitlogs

On Tue, Feb 5, 2013 at 7:32 AM, lars hofhansl <[EMAIL PROTECTED]> wrote:

> We just found ourselves in an interesting pickle.
>
> We were upgrading one of our clusters from HBase 0.94.0 on Hadoop 1.0.4 to
> HBase 0.94.4 on top of Hadoop 2.
> The cluster was set up a while ago, and the old shutdown script had a
> bug that shut down HBase and HDFS uncleanly.
>
> Assuming that the logs would be replayed, we upgraded Hadoop to 2.0.x and
> verified that everything is OK from a file system point of view.
> The new HDFS runs with an HA NameNode, so the FS changed from
> hdfs://<old host name> to hdfs://<ha cluster name>.
>
>
> Then we brought up HBase and found it stuck in splitting logs forever.
> In the log we see messages like these:
> 2013-02-05 06:22:31,045 ERROR
> org.apache.hadoop.hbase.regionserver.SplitLogWorker: unexpected error
> java.lang.IllegalArgumentException:
>  Wrong FS:
> hdfs://<old NN host>/.logs/<rs host>,60020,1358540589323-splitting/<rs
> host>%2C60020%2C1358540589323.1359962644861,
>  expected: hdfs://<ha cluster name>
>         at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:547)
>         at
> org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:169)
>         at
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:783)
>         at
> org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:111)
>         at
> org.apache.hadoop.hbase.regionserver.SplitLogWorker.grabTask(SplitLogWorker.java:264)
>         at
> org.apache.hadoop.hbase.regionserver.SplitLogWorker.taskLoop(SplitLogWorker.java:195)
>         at
> org.apache.hadoop.hbase.regionserver.SplitLogWorker.run(SplitLogWorker.java:163)
>         at java.lang.Thread.run(Thread.java:662)
>
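For illustration, here is a simplified sketch (in Python, not the actual Hadoop code) of the kind of check that `FileSystem.checkPath` appears to be applying in the trace above: a fully qualified path is rejected when its scheme or authority differs from the URI of the file system it is handed to. The function name `check_path` and the host names in the usage note are hypothetical placeholders.

```python
from urllib.parse import urlsplit


def check_path(fs_uri: str, path: str) -> None:
    """Raise ValueError if `path` is qualified with a different FS than `fs_uri`.

    Simplified illustration only; the real Hadoop check handles more cases
    (default ports, canonical URIs, and so on).
    """
    fs = urlsplit(fs_uri)
    p = urlsplit(path)

    # A path with no scheme is not bound to any file system, so it passes.
    if not p.scheme:
        return

    same_scheme = p.scheme.lower() == fs.scheme.lower()
    same_authority = p.netloc.lower() == fs.netloc.lower()
    if not (same_scheme and same_authority):
        raise ValueError(f"Wrong FS: {path}, expected: {fs_uri}")
```

With a hypothetical old NameNode `old-nn.example.com` and a new HA nameservice `ha-cluster`, `check_path("hdfs://ha-cluster", "hdfs://old-nn.example.com/.logs/x")` raises, while `check_path("hdfs://ha-cluster", "/.logs/x")` passes.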
> So it looks like distributed log splitting stores the full HDFS path name
> including the host, which seems unnecessary.
> This path is stored in ZK.
>
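A minimal sketch of the alternative, again in Python and purely illustrative: store only the part of the path after the scheme and authority, and re-qualify it against whatever file system is current when the task is picked up. The helper names `strip_fs_prefix` and `qualify` and the host names below are assumptions made for this example; this is not what HBase actually does.

```python
import re

# Matches a leading "scheme://authority" prefix, e.g. "hdfs://namenode:8020".
_FS_PREFIX = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*://[^/]*")


def strip_fs_prefix(path: str) -> str:
    """Drop the scheme and authority, keeping only the FS-relative path."""
    return _FS_PREFIX.sub("", path, count=1)


def qualify(fs_uri: str, relative_path: str) -> str:
    """Re-attach a relative path to the file system that is current now."""
    return fs_uri.rstrip("/") + relative_path


# Hypothetical host names, standing in for <old NN host> and <ha cluster name>.
stored = strip_fs_prefix("hdfs://old-nn.example.com/.logs/rs1-splitting/wal.1")
resolved = qualify("hdfs://ha-cluster", stored)
```

Here `stored` is `/.logs/rs1-splitting/wal.1` and `resolved` is `hdfs://ha-cluster/.logs/rs1-splitting/wal.1`, so a task recorded before the NameNode change would still resolve afterwards.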
> So all in all it seems this can only happen if all of the following are
> true: unclean shutdown, keeping the same ZK ensemble, changed FS.
>
>
> The data is not important and we could just blow it away, but we want to
> prove that we could recover the data if we had to.
> It seems we have three options:
>
> 1. Blow away the data in ZK under "splitlog", and restart HBase. It should
> restart the split process with the correct pathnames.
>
> 2. Temporarily change the config for the region server to set the root dir
> to hdfs://<old NN host>, bounce HBase. The log splitting should now be able
> to succeed.
>
> 3. Downgrade back to the old Hadoop (we kept a copy of the image).
>
> We're trying option #2, to see whether that would fix it. #1 should work
> too.
>
>
> Has anybody else experienced this?
> It seems that would also limit our ability to take a snapshot of a
> filesystem and move it to somewhere else, as the hostnames are hardcoded,
> at least in ZK for log splitting.
>
>
> -- Lars
>