Re: Solution for blocking fsync in 0.8

Makes sense.  My concern was less per topic and more other things on the
same box (I probably want kafka to sync more often than my webserver,
but less often than a database).

On 2012-06-19 01:06, Jay Kreps wrote:
> Yes, that's right, it is a global setting so you lose the ability to have
> per-topic overrides. I think the idea, though, is with replication the real
> durability guarantee comes from the replication and the syncing is just to
> ensure data makes it to disk reasonably quickly.
> -Jay
> On Mon, Jun 18, 2012 at 6:21 PM, Chris Burroughs wrote:
>> Thanks Jay.  This is a very helpful investigation!
>> On 05/24/2012 01:40 PM, Jay Kreps wrote:
>>> Unfortunately *any* call to fsync will block appends even in a background
>>> thread so how can we give control over physical disk persistence without
>>> introducing high latency for the producer? The answer is that the linux
>>> pdflush daemon actually does a very similar thing to our flush parameters.
>>> pdflush is a daemon running on every linux machine that controls the
>>> writing of buffered/cached data back to disk. It allows you to control the
>>> percentage of memory filled with dirty pages by giving it either a
>>> percentage of memory, a time out for any dirty page to be written, or a
>>> fixed number of dirty bytes.
>> This would, however, by necessity be a global setting, right?  (Assuming
>> there is no /proc trickery to change per-pid pdflush behaviour)
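
For reference, the three kinds of limit described in the quoted message (a percentage of memory, a timeout for dirty pages, a fixed number of dirty bytes) appear to correspond to the Linux vm.dirty_* tunables exposed under /proc/sys/vm. The sketch below only reads them so the current values can be inspected. The function name read_dirty_settings and its vm_dir parameter are made up for illustration, and it assumes the standard /proc layout. These files are system-wide, which is the "global setting" concern discussed in this thread.

```python
import os

# Linux writeback tunables under /proc/sys/vm (system-wide, not per-process).
DIRTY_KNOBS = (
    "dirty_background_ratio",     # % of memory dirty before background writeback starts
    "dirty_background_bytes",     # same threshold expressed in bytes
    "dirty_ratio",                # % of memory dirty before writers are throttled
    "dirty_bytes",                # same threshold expressed in bytes
    "dirty_expire_centisecs",     # age at which a dirty page becomes eligible for writeback
    "dirty_writeback_centisecs",  # how often the flusher threads wake up
)


def read_dirty_settings(vm_dir="/proc/sys/vm"):
    """Return {knob: int or None}; None means the file was missing or unreadable.

    vm_dir is a hypothetical parameter so the function can be pointed at a
    different directory for testing.
    """
    settings = {}
    for name in DIRTY_KNOBS:
        path = os.path.join(vm_dir, name)
        try:
            with open(path) as f:
                settings[name] = int(f.read().strip())
        except (OSError, ValueError):
            settings[name] = None
    return settings


if __name__ == "__main__":
    for name, value in read_dirty_settings().items():
        print(f"{name} = {value}")
```

Changing any of these values (for example with sysctl) affects every process on the box, so a value chosen for Kafka also applies to a webserver or database sharing the machine.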