Re: [VOTE] BookKeeper (including Hedwig) subproject proposal


On Mar 18, 2011, at 10:11 PM, Benjamin Reed wrote:

> +1 i'm all for it of course :)
> On Fri, Mar 18, 2011 at 2:11 PM, Benjamin Reed <[EMAIL PROTECTED]> wrote:
>> Proposal
>> BookKeeper is a distributed write-ahead logging (WAL) service. It is
>> built on top of ZooKeeper and is used for distributed recovery and
>> reliability. Much like ZooKeeper itself, BookKeeper is a distributed
>> tool used for reliability, but unlike ZooKeeper it is used to store
>> large amounts of application data in the form of byte streams, which
>> we call ledgers. It is made up of Bookies, which store data, and a
>> client library. All other meta-data is stored in ZooKeeper.
>> The BookKeeper subproject also includes Hedwig, which is a pub/sub
>> system built on both BookKeeper and ZooKeeper. Its coupling with
>> BookKeeper is tight and many of the performance features of BookKeeper
>> were added in response to Hedwig's requirements. Hedwig is made up of
>> a rather thin client library and stateless Brokers that cache and
>> distribute messages.
>> Background
>> BookKeeper was developed as a WAL for the Hadoop NameNode and was also
>> used to build the Hedwig pub/sub system. Both are currently contribs
>> to ZooKeeper. The work to get the hooks necessary to integrate
>> BookKeeper with the NameNode is almost complete (HDFS-1580).
>> Rationale
>> We have contributors that we would like to make committers to
>> BookKeeper and Hedwig. It would be nice to allow a development
>> community to grow around BookKeeper.
>> Also, Hudson does not run against contrib. Making BookKeeper its own
>> subproject would allow us to better QA our changes.
>> We also would like to decouple BookKeeper releases from ZooKeeper
>> releases. ZooKeeper is quite mature and has relatively long release
>> cycles. We would like shorter release cycles for BookKeeper.
>> In theory we could make two projects, BookKeeper and Hedwig, but doing
>> so would double the project management and release overhead. The
>> development community between BookKeeper and Hedwig overlaps heavily,
>> so we would be increasing the burden on the same group of
>> contributors.
>> Because of the developer community overlap with ZooKeeper and the fact
>> that BookKeeper is in line with the general mission of ZooKeeper, we
>> think BookKeeper should be a subproject of ZooKeeper.
>> Call for vote
>> I propose that BookKeeper become a ZooKeeper subproject subject to
>> ZooKeeper PMC and Bylaws. I, Benjamin Reed, will champion the
>> proposal. BookKeeper will have the following initial committers:
>> Dhruba Borthakur (Facebook)
>> Flavio Junqueira (Yahoo)
>> Ivan Kelly (Yahoo)
>> Benjamin Reed (Yahoo)
>> Utkarsh Srivastava (Twitter)


research scientist

direct +34 93-183-8828

avinguda diagonal 177, 8th floor, barcelona, 08018, es
phone (408) 349 3300    fax (408) 349 3301