Re: [DISCUSSION] Proposal for making core Hadoop changes
Eli,

Just checking on the status of this proposal.

In the past I was hesitant about introducing more formalities.
I now think we really need some mechanism for
new feature and project proposals, and also for tracking decisions,
for exactly the reasons you describe in your email.
Whether it is going to be HEP or something else, it is best
if we adopt it soon.

Thanks,
--Konstantin
On 5/21/2010 1:42 PM, Eli Collins wrote:
> As HDFS and MapReduce have matured the cost and complexity of
> introducing features has grown. Each new feature has to consider
> interactions with a growing set of existing features, a growing user
> base (upgrades, backwards compatibility) and additional use cases
> (more and more projects now build on them). At the same time we don't
> want the high bar for contribution to unnecessarily hinder new
> development and releases.
>
> Many projects at a similar stage address this by adopting a more
> formal way to describe, socialize and shepherd enhancements to their
> platforms. Today, new features are often discussed via an umbrella
> jira, which may have an attached design document. There are a number
> of issues with this approach. The design documents vary in format and
> quality, and are often reviewed by a limited audience. They aren't
> version controlled. Sometimes the proposal is only partially
> specified. Jiras are often ignored. Understanding a proposal and its
> implications through a series of threads in the jira comments is
> difficult. It's hard for contributors and users to find these
> top-level jiras and follow their status.
>
> I'd like to propose that core Hadoop adopt something similar to
> Python's PEP (Python Enhancement Proposal) [1]. A "HEP" would be a
> single primary mechanism for proposing new features, incorporating
> community feedback, and recording decisions. The author of the HEP
> would be responsible for building consensus and moving the feature
> forward. Similarly, some subset of the community would be responsible
> for reviewing HEPs in a timely manner and identifying missing pieces
> in the proposal. Discussion would occur before patches showed up on
> jira. People interested in the core Hadoop roadmap could keep an eye
> on the HEPs without the overhead of following jira traffic.
>
> Why base this on the PEP? The format has proven useful to a
> substantial existing project, and I think the workflow is not too
> heavy-weight, and well-suited to a community such as ours. That being
> said, we could discuss other models (e.g. Java's JSR).
>
> Before we get into specifics, is this something the community would
> like to adopt in some form? Does adapting the PEP and its workflow to
> our projects, community and bylaws seem reasonable?
>
> Thanks,
> Eli
>
> 1. http://www.python.org/dev/peps/pep-0001
>