HBase, mail # dev - Good VLDB paper on WALs


Re: Good VLDB paper on WALs
Nicolas Spiegelberg 2010-12-29, 22:18
+1 for ELR.

I think having some data structure where we prepare the next stage of
sync() operations instead of holding the row lock over the sync would be a
big win for hot regions without a huge refactor.  I think the other two
optimizations are useful to think about, but wouldn't have the same
impact/effort ratio as ELR.
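
A minimal Java sketch of the idea above, assuming a single row lock and an in-memory list standing in for the log file. The names ElrWalSketch, append, sync and isRowLocked are hypothetical and are not HBase APIs. The edit is queued into a "next batch" structure while the row lock is held, the lock is released before any sync, and the caller gets a future that completes only once the batch has been synced, so the client response is still held back until the edit is durable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch of early lock release (ELR); not HBase's actual WAL code. */
final class ElrWalSketch {
    /** One buffered edit plus the future its client waits on. */
    private static final class Pending {
        final String edit;
        final CompletableFuture<Void> durable = new CompletableFuture<>();
        Pending(String edit) { this.edit = edit; }
    }

    private final ReentrantLock rowLock = new ReentrantLock();  // stands in for a per-row lock
    private final Object queueLock = new Object();
    private List<Pending> nextBatch = new ArrayList<>();        // the prepared "next stage" of sync()
    private final List<String> durableLog = new ArrayList<>();  // stands in for the file on disk

    /** Applies an edit under the row lock, queues it, and releases the lock before any sync. */
    CompletableFuture<Void> append(String edit) {
        Pending p = new Pending(edit);
        rowLock.lock();
        try {
            // The in-memory update would happen here, while the row lock is held.
            synchronized (queueLock) {
                nextBatch.add(p); // queued under the row lock, so per-row order is preserved
            }
        } finally {
            rowLock.unlock(); // early lock release: nothing has been synced yet
        }
        return p.durable; // the caller must not answer the client until this completes
    }

    /** Swaps out the prepared batch, "syncs" it, then completes the waiting futures. */
    synchronized int sync() {
        List<Pending> batch;
        synchronized (queueLock) {
            batch = nextBatch;
            nextBatch = new ArrayList<>();
        }
        for (Pending p : batch) {
            durableLog.add(p.edit); // a real WAL would write and fsync here
        }
        for (Pending p : batch) {
            p.durable.complete(null); // only now may the client response be sent
        }
        return batch.size();
    }

    boolean isRowLocked() { return rowLock.isLocked(); }
}
```

In this sketch two append() calls on a hot row can both finish and drop the row lock before a single sync() covers them; the trade-off is that the handler has to park the response on the returned future rather than reply right away.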
On 12/29/10 11:32 AM, "Stack" <[EMAIL PROTECTED]> wrote:

>Nice list of things we need to do to make logging faster (with useful
>citations on the current state of the art).  This notion of early lock
>release (ELR) is worth looking into (Jon, for high rates of counter
>transactions, you've been talking about aggregating counts in front of
>the WAL lock... maybe an ELR and then a hold on the transaction until
>confirmation of flush would be the way to go?).  Regarding
>flush-pipelining, it would be interesting to see if there are traces of
>the sys-time that Dhruba is seeing in his NN out in HBase servers.  My
>guess is that it's probably drowned out by other context switches done
>in our servers.  Definitely worth study.
>
>St.Ack
>P.S. Minimizing context switches, a system for ELR and
>flush-pipelining, recasting the server to make use of one of the DI or
>OSGi frameworks, moving off log4j, etc. Is it just me or do others
>feel a server rewrite coming on?
>
>
>On Mon, Dec 27, 2010 at 11:48 AM, Dhruba Borthakur <[EMAIL PROTECTED]>
>wrote:
>> HDFS currently uses Hadoop RPC and the server thread blocks till the WAL is
>> written to disk. In earlier deployments, I thought we could safely ignore
>> flush-pipelining by creating more server threads. But in our largest HDFS
>> systems, I am starting to see 20% sys-time usage on the namenode machine;
>> most of this could be thread scheduling. If so, then it makes sense to
>> enhance the logging code to release server threads even before the WAL is
>> flushed to disk (but, of course, we still have to delay the transaction
>> response to the client till the WAL is synced to disk).
>>
>> Does anybody have any idea on how to figure out what percentage of the above
>> sys-time is spent in thread scheduling vs the time spent in other system
>> calls (especially in the Namenode context)?
>>
>> thanks,
>> dhruba
>>
>>
>> On Fri, Dec 24, 2010 at 8:17 PM, Todd Lipcon <[EMAIL PROTECTED]> wrote:
>>
>>> Via Hammer - I thought this was a pretty good read, some good ideas for
>>> optimizations for our WAL.
>>>
>>> http://infoscience.epfl.ch/record/149436/files/vldb10aether.pdf
>>>
>>> -Todd
>>> --
>>> Todd Lipcon
>>> Software Engineer, Cloudera
>>>
>>
>>
>>
>> --
>> Connect to me at http://www.facebook.com/dhruba
>>