Re: Hlog Group Commit Question: SequenceFileLogReader
Nicolas Spiegelberg 2010-01-26, 23:50
Sounds like I have 0.20.3RC merging to do as well. Version mistake on my
behalf: we have a 0.20.3-dev that must've missed the 'syncfs' changes. I
was merging based solely on the 0.21 trunk. I'll look into the 0.20.3 code.
I have done the HDFS-200 merge & the trunk group commit; now I need to
reconcile that with the 0.20.3RC code since we don't currently plan on
On 1/26/10 2:02 PM, "Stack" <[EMAIL PROTECTED]> wrote:
> HBase 0.20 had a hack that would recognize the presence of Dhruba's
> HDFS-200. If it had been applied, then we'd do the open-for-append,
> close, and reopen to recover edits written to an unclosed WAL/HLog
> file (Grep 'syncfs' in HLog on the 0.20 branch).
> In HBase TRUNK, the above hackery was stripped out. In TRUNK we are
> leaning on the new hflush/HDFS-265 rather than HDFS-200. For hflush,
> when we do FSDataInputStream::available(), it's returning the 'right'
> answer (WALReaderFsDataInputStream::getPos() was added before an API
> was available. HBASE-2069 is about using the new API instead of this
> getPos fancy-dancing).
> It sounds like you need to do a bit of merging of TRUNK group commit
> and the old hbase code that exploited HDFS-200?
> On Tue, Jan 26, 2010 at 12:35 PM, Nicolas Spiegelberg
> <[EMAIL PROTECTED]> wrote:
>> I am trying to backport the HLog group commit functionality to HBase 0.20.
>> For proper reliability, I am working with Dhruba to get the 0.21 syncFs()
>> changes from HDFS ported back to HDFS 0.20 as well. When going through a
>> peer review of the modified code, my group had a question about the
>> SequenceFileLogReader.java (WALReader). I am hoping that you guys could be
>> of assistance.
>> I know that there is an open issue [HBASE-2069] where HLog::splitLog() does
>> not call DFSDataInputStream::getVisibleLength(), which would properly sync
>> hflushed, but unclosed, file lengths. I believe the current workaround is to
>> open an HDFS file in append mode & then close it, which would cause the namenode
>> to get updates from the datanodes. However, I don't see that shim present in
>> HLog::splitLog() on the 0.21 trunk. Is this a pending issue to fix or is
>> calling FSDataInputStream::available() within
>> WALReaderFsDataInputStream::getPos() sufficient to force the namenode to sync
>> up with the datanodes?
>> Nicolas Spiegelberg
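
For readers following the available()/getPos() discussion above, here is a minimal sketch of the general idea: estimate the readable end of a stream as the bytes consumed so far plus whatever available() reports, rather than trusting a length recorded elsewhere. It uses plain java.io only and is not HBase or HDFS code. PositionTrackingStream, readableEnd() and VisibleLengthSketch are hypothetical names made up for this illustration, and a ByteArrayInputStream stands in for an unclosed log file. Note that java.io only promises an estimate from available(); the thread's claim that it returns the 'right' answer applies to HDFS with hflush.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for a WAL reader stream; not HBase or HDFS code.
class PositionTrackingStream extends FilterInputStream {
    private long pos = 0; // bytes consumed so far

    PositionTrackingStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) {
            pos++;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            pos += n;
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        pos += skipped;
        return skipped;
    }

    // mark/reset would invalidate the position counter, so disable it.
    @Override
    public boolean markSupported() {
        return false;
    }

    long getPos() {
        return pos;
    }

    // Readable end = bytes already consumed + bytes the underlying
    // stream reports as still available.
    long readableEnd() throws IOException {
        return pos + in.available();
    }
}

class VisibleLengthSketch {
    public static void main(String[] args) throws IOException {
        // Assumption for the demo: stale metadata claims 4 bytes,
        // but 10 bytes have actually been written.
        long staleLength = 4;
        byte[] written = new byte[10];

        PositionTrackingStream in =
            new PositionTrackingStream(new ByteArrayInputStream(written));
        in.read(new byte[3]); // consume a few bytes first

        System.out.println("stale length:  " + staleLength);
        System.out.println("position:      " + in.getPos());
        System.out.println("readable end:  " + in.readableEnd());
    }
}
```

The sketch keeps the position counter in the wrapper so that readableEnd() stays constant as bytes are consumed, which is what a reader needs when it computes an end offset once and then reads up to it.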