Zookeeper, mail # dev - [VOTE] BookKeeper (including Hedwig) subproject proposal


[VOTE] BookKeeper (including Hedwig) subproject proposal
Benjamin Reed 2011-03-18, 21:11
Proposal

BookKeeper is a distributed write-ahead logging (WAL) service. It is
built on top of ZooKeeper and is used for distributed recovery and
reliability. Much like ZooKeeper itself, BookKeeper is a distributed
tool used for reliability, but unlike ZooKeeper it is used to store
large amounts of application data in the form of byte streams, which
we call ledgers. It is made up of Bookies, which store data, and a
client library. All other metadata is stored in ZooKeeper.

The BookKeeper subproject also includes Hedwig, which is a pub/sub
system built on both BookKeeper and ZooKeeper. Its coupling with
BookKeeper is tight and many of the performance features of BookKeeper
were added in response to Hedwig's requirements. Hedwig is made up of
a rather thin client library and stateless Brokers that cache and
distribute messages.

Background

BookKeeper was developed as a WAL for the Hadoop NameNode and was also
used to build the Hedwig pub/sub system. Both are currently contribs
to ZooKeeper. The work to get the hooks necessary to integrate
BookKeeper with the NameNode is almost complete (HDFS-1580).

Rationale

We have contributors that we would like to make committers to
BookKeeper and Hedwig. It would be nice to allow a development
community to grow around BookKeeper.

Also, Hudson does not run against contrib. Making BookKeeper its own
subproject would allow us to better QA our changes.

We also would like to decouple BookKeeper releases from ZooKeeper
releases. ZooKeeper is quite mature and has relatively long release
cycles. We would like shorter release cycles for BookKeeper.

In theory we could make two projects, BookKeeper and Hedwig, but doing
so would double the project management and release overhead. The
development community between BookKeeper and Hedwig overlaps heavily,
so we would be increasing the burden on the same group of
contributors.

Because of the developer community overlap with ZooKeeper and the fact
that BookKeeper is in line with the general mission of ZooKeeper, we
think BookKeeper should be a subproject of ZooKeeper.

Call for vote

I propose that BookKeeper become a ZooKeeper subproject subject to
the ZooKeeper PMC and Bylaws. I, Benjamin Reed, will champion the
proposal. BookKeeper will have the following initial committers:

Dhruba Borthakur (Facebook)
Flavio Junqueira (Yahoo)
Ivan Kelly (Yahoo)
Benjamin Reed (Yahoo)
Utkarsh Srivastava (Twitter)