HBase >> mail # dev >> Hlog Group Commit Question: SequenceFileLogReader


Re: Hlog Group Commit Question: SequenceFileLogReader
Sounds like I have 0.20.3RC merging to do as well.  Version mistake on my
part.  We have a 0.20.3-dev that must've missed the 'syncfs' changes.  I
was merging based solely on the 0.21 trunk.  I'll look into the 0.20.3 code
some more.

I have done the HDFS-200 merge & the trunk group commit, now I need to
reconcile that with the 0.20.3RC code since we don't currently plan on
merging HDFS-265.

Thanks!
Nicolas Spiegelberg
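
For readers following the thread, the open-for-append, close, and reopen sequence discussed in the quoted messages below can be pictured with a toy model. This is only a sketch under stated assumptions: it uses no Hadoop or HBase classes, and every name in it (ToyFile, ToyWriter, recover) is hypothetical. It assumes, as the thread describes, that the length a reader sees is refreshed only when a writer closes the file.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Toy model only: no Hadoop classes are used and all names are hypothetical.
class AppendReopenSketch {
    /** Bytes held by the "datanodes" plus the length the "namenode" has recorded. */
    static class ToyFile {
        final ByteArrayOutputStream blockData = new ByteArrayOutputStream();
        long recordedLength = 0; // assumption: refreshed only when a writer closes
    }

    static class ToyWriter {
        private final ToyFile file;
        ToyWriter(ToyFile file) { this.file = file; }

        void write(byte[] edit) { file.blockData.write(edit, 0, edit.length); }

        /** Closing publishes the true length, as the thread says a close does. */
        void close() { file.recordedLength = file.blockData.size(); }
    }

    /** A reader sees only as many bytes as the recorded length allows. */
    static byte[] read(ToyFile file) {
        return Arrays.copyOf(file.blockData.toByteArray(), (int) file.recordedLength);
    }

    /** Open for append, close immediately, then reopen for reading. */
    static byte[] recover(ToyFile file) {
        ToyWriter appender = new ToyWriter(file); // "open for append"
        appender.close();                         // "close": length is refreshed
        return read(file);                        // "reopen": edits are now visible
    }

    public static void main(String[] args) {
        ToyFile wal = new ToyFile();
        ToyWriter writer = new ToyWriter(wal);
        for (int i = 0; i < 3; i++) {
            writer.write(new byte[10]); // three edits, writer never closes
        }
        System.out.println("before recovery: " + read(wal).length);
        System.out.println("after recovery:  " + recover(wal).length);
    }
}
```

In the model, a reader sees none of the unclosed writer's edits until the append-and-close step runs. Whether a real cluster behaves this way depends on which HDFS patches are applied, which is the point under discussion in the thread.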

On 1/26/10 2:02 PM, "Stack" <[EMAIL PROTECTED]> wrote:

> HBase 0.20 had a hack that would recognize the presence of Dhruba's
> HDFS-200.  If it had been applied, then we'd do the open-for-append,
> close, and reopen to recover edits written to an unclosed WAL/HLog
> file (Grep 'syncfs' in HLog on the 0.20 branch).
>
> In HBase TRUNK, the above hackery was stripped out.  In TRUNK we are
> leaning on the new hflush/HDFS-265 rather than HDFS-200.  For hflush,
> when we do FSDataInputStream::available(), it's returning the 'right'
> answer (WALReaderFsDataInputStream::getPos() was added before an API
> was available.  HBASE-2069 is about using the new API instead of this
> getPos fancy-dancing).
>
> It sounds like you need to do a bit of merging of TRUNK group commit
> and the old hbase code that exploited HDFS-200?
>
> St.Ack
>
> On Tue, Jan 26, 2010 at 12:35 PM, Nicolas Spiegelberg
> <[EMAIL PROTECTED]> wrote:
>> Hi,
>>
>> I am trying to backport the HLog group commit functionality to HBase 0.20.
>>  For proper reliability, I am working with Dhruba to get the 0.21 syncFs()
>> changes from HDFS ported back to HDFS 0.20 as well.  When going through a
>> peer review of the modified code, my group had a question about the
>> SequenceFileLogReader.java (WALReader).  I am hoping that you guys could be
>> of assistance.
>>
>> I know that there is an open issue [HBASE-2069] where HLog::splitLog() does
>> not call DFSDataInputStream::getVisibleLength(), which would properly sync
>> hflushed, but unclosed, file lengths.  I believe the current workaround is to
>> open an HDFS file in append mode & then close, which would cause the namenode
>> to get updates from the datanodes.  However, I don't see that shim present in
>> HLog::splitLog() on the 0.21 trunk.  Is this a pending issue to fix or is
>> calling FSDataInputStream::available() within
>> WALReaderFsDataInputStream::getPos() sufficient to force the namenode to sync
>> up with the datanodes?
>>
>> Nicolas Spiegelberg
>>
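
The getPos()/available() trick mentioned in the thread can be illustrated with a small stand-alone sketch. It is not the HBase code: the class names are hypothetical, java.io.ByteArrayInputStream stands in for the HDFS stream, and it assumes a reader that computes its end offset as getPos() plus a possibly stale file length. Under that assumption, a first getPos() call that returns available() minus the stale length makes the reader's end offset match the bytes actually available.

```java
import java.io.ByteArrayInputStream;

// Illustrative sketch only: not HBase code, all class names are hypothetical.
class GetPosSketch {
    /** Stand-in for a positioned stream over a log file's visible bytes. */
    static class PositionedStream extends ByteArrayInputStream {
        PositionedStream(byte[] visibleBytes) { super(visibleBytes); }

        long getPos() { return pos; } // pos is a protected field of ByteArrayInputStream
    }

    /** First getPos() call is skewed so that getPos() + staleLength == available(). */
    static class LengthCorrectingStream extends PositionedStream {
        private final long staleLength;
        private boolean firstGetPos = true;

        LengthCorrectingStream(byte[] visibleBytes, long staleLength) {
            super(visibleBytes);
            this.staleLength = staleLength;
        }

        @Override
        long getPos() {
            if (firstGetPos) {
                firstGetPos = false;
                long available = available();
                // Never report a negative position if the stale length is larger.
                return Math.max(0L, available - staleLength);
            }
            return super.getPos();
        }
    }

    /** Assumed reader behaviour: end offset = current position + reported length. */
    static long computeEnd(PositionedStream in, long staleLength) {
        return in.getPos() + staleLength;
    }

    public static void main(String[] args) {
        byte[] visible = new byte[100]; // bytes actually readable
        long stale = 40;                // length reported before the last flushes
        System.out.println(computeEnd(new PositionedStream(visible), stale));
        System.out.println(computeEnd(new LengthCorrectingStream(visible, stale), stale));
    }
}
```

With 100 readable bytes and a stale length of 40, the plain stream yields an end offset of 40 while the correcting stream yields 100. The sketch says nothing about whether available() forces the namenode to sync with the datanodes, which is the open question above.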