Thanks for mentioning that ticket.
I got a partially working prototype going, but it is still a long way
from being production ready. I feel that scaling storage would be nice to
have. ZooKeeper provides a subscription model, so it is efficient for
storing data that has a low change rate but needs fast propagation. Our
internal service discovery and configuration distribution rely on this
property. Without scaling the storage, you end up needing to build another
distribution channel for the actual data if you only put metadata into
ZooKeeper.
There are two major issues with external storage that I remember off the
top of my head from the Berkeley DB integration:
1. Object serialization affects read throughput, since we need to
deserialize the znode object on every read request. We might be able to
address this with extra caching.
2. ZooKeeper validates all data entries before serving traffic, so it is
able to detect data corruption and use a backup or fetch from other
servers. However, with a DB and large data, we can only do a simple
integrity check, and a server may fail to return some entries if
corruption is detected once it is online. This is a new failure condition
that didn't exist before, and I am not sure how to handle it.
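
To make the two issues concrete, here is a rough sketch in plain Java. It
is not ZooKeeper or Berkeley DB code: CheckedNodeStore, CorruptEntryException
and the in-memory map standing in for the external DB are all hypothetical,
and a String stands in for the znode object. It assumes a small LRU cache of
deserialized values in front of the store (issue 1) and a per-entry CRC32
checked lazily on read (issue 2), so a corrupt entry only shows up as a
failed read after the server is already serving.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.zip.CRC32;

// Hypothetical: raised when a stored entry fails its checksum at read time.
class CorruptEntryException extends RuntimeException {
    CorruptEntryException(String path) {
        super("checksum mismatch for " + path);
    }
}

// Hypothetical stand-in for a znode store backed by an external DB.
final class CheckedNodeStore {
    // Serialized payload plus the CRC32 recorded when it was written.
    private static final class Stored {
        final byte[] bytes;
        final long crc;
        Stored(byte[] bytes, long crc) { this.bytes = bytes; this.crc = crc; }
    }

    // In-memory map used here in place of the real external database.
    private final Map<String, Stored> backing = new HashMap<>();
    // Access-ordered LinkedHashMap acting as a small LRU cache.
    private final Map<String, String> cache;
    // Counts how often we actually pay the deserialization cost.
    int deserializations = 0;

    CheckedNodeStore(int cacheSize) {
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > cacheSize;
            }
        };
    }

    private static long crcOf(byte[] bytes) {
        CRC32 crc = new CRC32();
        crc.update(bytes, 0, bytes.length);
        return crc.getValue();
    }

    void put(String path, String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        backing.put(path, new Stored(bytes, crcOf(bytes)));
        cache.remove(path); // drop any stale deserialized copy
    }

    // Simulates on-disk corruption by flipping one bit of the payload.
    void corrupt(String path) {
        Stored stored = backing.get(path);
        if (stored != null && stored.bytes.length > 0) {
            stored.bytes[0] ^= 0x01;
        }
        cache.remove(path);
    }

    // Cache hit: no deserialization. Cache miss: verify checksum, then decode.
    Optional<String> get(String path) {
        String cached = cache.get(path);
        if (cached != null) {
            return Optional.of(cached);
        }
        Stored stored = backing.get(path);
        if (stored == null) {
            return Optional.empty();
        }
        if (crcOf(stored.bytes) != stored.crc) {
            // The new failure mode: detected only now, while serving reads.
            throw new CorruptEntryException(path);
        }
        deserializations++;
        String value = new String(stored.bytes, StandardCharsets.UTF_8);
        cache.put(path, value);
        return Optional.of(value);
    }
}
```

What the caller should do on CorruptEntryException (fetch the entry from
another server, fall back to a backup, or take the server out of rotation)
is exactly the part I don't have an answer for, so the sketch leaves it
open.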
On 5/23/14, 11:23 AM, "Michi Mutsuzaki" <[EMAIL PROTECTED]> wrote: