Thanks for the detailed design document and the in-depth walkthrough [1]!
Your proposal seems sound. (Be warned, though: I don’t have much experience with this part of Aurora or Mesos :-))


On 31.08.17, 04:18, "Jordan Ly" <[EMAIL PROTECTED]> wrote:

    Hi everyone,
    Following up on the discussion here:
    I've created a design document detailing the implementation of a "hot
    standby" mechanism where scheduler followers would eagerly read and
    apply entries from the replicated log. The goal of this change is
    that, in the event of a failover, the newly elected follower will not
    have to replay as many entries to rebuild its state and thus can start
    serving traffic faster.
    I have a working prototype of the above design running on a test
    cluster. Please feel free to comment on the doc!
    This document references a current proposal in Mesos by Ilya Pronin.

    Jordan Ly