OZAWA Tsuyoshi 2011-09-23, 13:08
Andrew Purtell 2011-09-23, 16:16
OZAWA Tsuyoshi 2011-09-23, 16:45
Ted Dunning 2011-09-23, 20:52
Edward Capriolo 2011-09-23, 23:38
OZAWA Tsuyoshi 2011-09-24, 01:54
Ryan Rawson 2011-09-24, 02:09
OZAWA Tsuyoshi 2011-09-24, 05:38
OZAWA Tsuyoshi 2011-09-24, 01:44
OZAWA Tsuyoshi 2011-09-25, 08:14
Re: [announce] Accord: A high-performance coordination service for write-intensive workloads
Thanks for sending this reference to the list, it sounds very
interesting. I have a few questions and comments, if you don't mind:
1- I was wondering if you can give more detail on the setup you used
to generate the numbers you show in the graphs on your Accord page.
The ZooKeeper values are way too low, and I suspect that you're using
a single hard drive. It could be because you expect to use a single
hard drive with an Accord server, and you wanted to make the
comparison fair. Is this correct?
2- The previous observation leads me to the next question: could you
say more about your use of disk with persistence on?
3- The limitation on the message size in ZooKeeper is not a
fundamental limitation. We have chosen to limit it for the reasons we
explain in the wiki page that is linked from the Accord page. Do you
have any particular use case in mind for which you think it would be
useful to have very large messages?
4- If I understand the group communication substrate Accord uses, it
enables Accord to process client requests in any server. ZooKeeper has
a leader for a few reasons, one being the ability to manage client
sessions. Ephemeral nodes, for example, are bound to sessions. Are
there similar abstractions in Accord? If the answer is positive, could
you explain it a bit? If not, is it doable with the substrate you're
using?
5- I'm not sure where we say that 8 bytes is a typical value in the
documentation. I actually remember writing in one of our papers that a
typical value is around 1k bytes.
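To make question 4 concrete, here is a minimal sketch of what "ephemeral nodes bound to sessions" means. It is a toy in-memory model under stated assumptions, not ZooKeeper's or Accord's implementation: the class `SessionStore` and all of its methods are hypothetical names invented for illustration.

```python
# Toy in-memory model of session-bound (ephemeral) nodes.
# SessionStore and its methods are hypothetical names,
# not ZooKeeper or Accord APIs.
class SessionStore:
    def __init__(self):
        self.nodes = {}        # path -> (value, owning session id or None)
        self.sessions = set()  # ids of live sessions

    def open_session(self, session_id):
        self.sessions.add(session_id)

    def create(self, path, value, session_id=None):
        """Create a node; passing session_id makes it ephemeral."""
        if session_id is not None and session_id not in self.sessions:
            raise KeyError(f"unknown session: {session_id}")
        self.nodes[path] = (value, session_id)

    def close_session(self, session_id):
        """End a session and delete every ephemeral node it owns."""
        self.sessions.discard(session_id)
        self.nodes = {
            path: (value, owner)
            for path, (value, owner) in self.nodes.items()
            if owner != session_id
        }


store = SessionStore()
store.open_session("s1")
store.create("/config", b"x")          # persistent node, no owner
store.create("/members/a", b"", "s1")  # ephemeral node owned by s1
store.close_session("s1")
remaining = sorted(store.nodes)
print(remaining)
```

In a replicated service, every server has to agree on when a session ends before the cleanup in `close_session` can run, which is the part that a single leader makes simpler.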
On Sep 23, 2011, at 4:22 PM, OZAWA Tsuyoshi wrote:
> Sending zookeeper-users and hbase-users ml since there may be some
> cluster developers interested in participating in this project there.
> I am pleased to announce the initial release of Accord, yet another
> coordination service like Apache ZooKeeper.
> ZooKeeper is a de facto standard coordination kernel as you know at
> Accord provides ZK-like features as a coordination service. Concretely
> speaking, it features:
> - Accord is a distributed, transactional, and fully-replicated (No
> Key-Value Store with strong consistency.
> - Accord can scale out to tens of nodes.
> - Accord servers can handle tens or thousands of clients.
> - The changes for a write request from a client can be notified to the
> other clients.
> - Accord detects clients joining and leaving, and notifies the other
> clients of which client joined or left.
> There are some problems in ZK, however, as follows:
> - ZK cannot handle write-intensive workloads well. ZK forwards all
> requests to a master server, which may become a bottleneck in
> write-intensive workloads.
> - ZK is optimized for disk-persistence mode, not for in-memory mode.
> ZOOKEEPER-866 shows that ZK has another bottleneck besides disk
> persistence, even though there is a need for fully-replicated storage
> with both strong consistency and low latency.
> - Limited Transaction APIs. ZK can only issue write operations (write,
> del) in a transaction (multi-update).
> These restrictions limit the capability of the coordination kernel.
> Accord solves such problems.
> 1. Accord uses Corosync Cluster Engine as a total-order messaging
> infrastructure instead of Zab, an atomic broadcast protocol ZK uses.
> This engine enables any server to accept and process requests.
> 2. Accord supports in-memory mode.
> 3. More flexible transaction support. Not only write, del operations,
> but also cmp, copy, read operations are supported in transaction
> These differences in the core engine (1, 2) let us avoid the master
> bottleneck. Benchmarks demonstrate that the write-operation throughput
> of Accord is much higher than that of ZooKeeper
> (up to 20 times better throughput in persistent mode, and up to 18
> times better throughput in in-memory mode).
> The high performance kernel can extend the application ranges. Assumed
OZAWA Tsuyoshi 2011-09-25, 07:02
Ted Dunning 2011-09-25, 11:14
Tsuyoshi OZAWA 2011-09-26, 02:18
Ted Dunning 2011-09-26, 11:56
Tsuyoshi OZAWA 2011-09-27, 02:30
Flavio Junqueira 2011-09-25, 21:49
Ted Dunning 2011-09-25, 22:26
Tsuyoshi OZAWA 2011-09-27, 02:19