Zookeeper >> mail # dev >> [VOTE] BookKeeper (including Hedwig) subproject proposal


Re: [VOTE] BookKeeper (including Hedwig) subproject proposal
I'm -0 as well - I really think that incubator is the logical place for both
BookKeeper and Hedwig, and gives them the best chance of establishing their
own communities and independence from ZooKeeper. My expectation is that both
projects will have to move out from under the ZK umbrella eventually, and
there is little to be gained from putting it off.

That said, I'm excited to see both projects moving forward, and subproject
is definitely significantly better than contrib, so I'm not prepared to veto
this.

Henry

On 21 March 2011 03:03, Ivan Kelly <[EMAIL PROTECTED]> wrote:

> +1
>
> On 18 Mar 2011, at 22:40, Utkarsh Srivastava wrote:
>
> > +1
> >
> > Utkarsh
> >
> > On Fri, Mar 18, 2011 at 2:29 PM, Mahadev Konar <[EMAIL PROTECTED]> wrote:
> >> +1.
> >>
> >> thanks
> >> mahadev
> >> On Fri, Mar 18, 2011 at 2:26 PM, Patrick Hunt <[EMAIL PROTECTED]> wrote:
> >>
> >>> -0. I'm all for bk/hedwig moving out from contrib, but as I stated earlier
> >>> I think it should move to incubator and not subproject. At the same time
> >>> it's important that the project can develop on its own, so I won't stand in
> >>> the way.
> >>>
> >>> Patrick
> >>>
> >>>
> >>> On Fri, Mar 18, 2011 at 2:18 PM, Flavio Junqueira <[EMAIL PROTECTED]> wrote:
> >>>
> >>>> +1.
> >>>>
> >>>> -Flavio
> >>>>
> >>>> On Mar 18, 2011, at 10:11 PM, Benjamin Reed wrote:
> >>>>
> >>>> +1 i'm all for it of course :)
> >>>>
> >>>> On Fri, Mar 18, 2011 at 2:11 PM, Benjamin Reed <[EMAIL PROTECTED]> wrote:
> >>>>
> >>>> Proposal
> >>>>
> >>>>
> >>>> BookKeeper is a distributed write-ahead logging (WAL) service. It is
> >>>> built on top of ZooKeeper and is used for distributed recovery and
> >>>> reliability. Much like ZooKeeper itself, BookKeeper is a distributed
> >>>> tool used for reliability, but unlike ZooKeeper it is used to store
> >>>> large amounts of application data in the form of byte streams, which
> >>>> we call ledgers. It is made up of Bookies, which store data, and a
> >>>> client library. All other meta-data is stored in ZooKeeper.
> >>>>
> >>>>
> >>>> The BookKeeper subproject also includes Hedwig, which is a pub/sub
> >>>> system built on both BookKeeper and ZooKeeper. Its coupling with
> >>>> BookKeeper is tight and many of the performance features of BookKeeper
> >>>> were added in response to Hedwig's requirements. Hedwig is made up of
> >>>> a rather thin client library and stateless Brokers that cache and
> >>>> distribute messages.
> >>>>
> >>>>
> >>>> Background
> >>>>
> >>>>
> >>>> BookKeeper was developed as a WAL for the Hadoop NameNode and was also
> >>>> used to build the Hedwig pub/sub system. Both are currently contribs
> >>>> to ZooKeeper. The work to get the hooks necessary to integrate
> >>>> BookKeeper with the NameNode is almost complete (HDFS-1580).
> >>>>
> >>>>
> >>>> Rationale
> >>>>
> >>>>
> >>>> We have contributors that we would like to make committers to
> >>>> BookKeeper and Hedwig. It would be nice to allow a development
> >>>> community to grow around BookKeeper.
> >>>>
> >>>>
> >>>> Also, Hudson does not run against contrib. Making BookKeeper its own
> >>>> subproject would allow us to better QA our changes.
> >>>>
> >>>>
> >>>> We also would like to decouple BookKeeper releases from ZooKeeper
> >>>> releases. ZooKeeper is quite mature and has relatively long release
> >>>> cycles. We would like shorter release cycles for BookKeeper.
> >>>>
> >>>>
> >>>> In theory we could make two projects, BookKeeper and Hedwig, but doing
> >>>> so would double the project management and release overhead. The
> >>>> development community between BookKeeper and Hedwig overlaps heavily,
> >>>> so we would be increasing the burden on the same group of
> >>>> contributors.
> >>>>
> >>>>
> >>>> Because of the developer community overlap with ZooKeeper and the fact
> >>>>
Henry Robinson
Software Engineer
Cloudera
415-994-6679