Hadoop >> mail # general >> Defining Hadoop Compatibility -revisiting-


Re: Defining Hadoop Compatibility -revisiting-
What does it mean to "implement" those interfaces? I'm +1 for a TCK-based
definition. In addition to statically implementing a set of interfaces, each
interface also implicitly includes a set of acceptable inputs and predicted
outputs (or ranges of outputs) for those inputs.

- Aaron
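[Aaron's point, that a TCK couples interfaces with expected behaviors, could be sketched in Java roughly as follows. The KeyValueStore, InMemoryStore, and ConformanceSuite names are purely illustrative, not part of any Hadoop API: a single suite of input/expected-output assertions that any candidate implementation must pass.]

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface standing in for a public Hadoop API.
interface KeyValueStore {
    void put(String key, String value);
    String get(String key);   // contract: must return null for absent keys
}

// One candidate implementation; a vendor would supply their own.
class InMemoryStore implements KeyValueStore {
    private final Map<String, String> data = new HashMap<>();
    public void put(String key, String value) { data.put(key, value); }
    public String get(String key) { return data.get(key); }
}

public class ConformanceSuite {
    // The suite encodes semantics, not just signatures: defined inputs
    // and the outputs an implementation must produce for them.
    static void verify(KeyValueStore store) {
        if (store.get("missing") != null)
            throw new AssertionError("absent key must yield null");
        store.put("k", "v1");
        store.put("k", "v2");                 // a later put wins
        if (!"v2".equals(store.get("k")))
            throw new AssertionError("put must overwrite");
        System.out.println("conformant");
    }

    public static void main(String[] args) {
        verify(new InMemoryStore());          // swap in any implementation
    }
}
```

[A vendor implementation would simply be passed to the same verify() call; passing the shared suite, rather than merely compiling against the interface, is what the "TCK-based definition" would mean.]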

On Wed, May 11, 2011 at 3:56 PM, Jacob R Rideout <[EMAIL PROTECTED]> wrote:

> What about defining compatibility as fully implementing all the
> public-stable annotated interfaces for a particular release?
>
> Jacob Rideout
>
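[Hadoop marks its public, stable surface with annotations from org.apache.hadoop.classification (InterfaceAudience.Public and InterfaceStability.Stable). The sketch below defines simplified stand-in annotations so it runs without Hadoop on the classpath, and checks them reflectively; RecordReader and SurfaceAudit are illustrative names, not real Hadoop types.]

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Simplified stand-ins for Hadoop's InterfaceAudience.Public and
// InterfaceStability.Stable annotations (org.apache.hadoop.classification).
@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE)
@interface Public {}
@Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE)
@interface Stable {}

// An interface a release would commit to: under Jacob's proposal, a
// compatible derivative must fully implement every such annotated type.
@Public @Stable
interface RecordReader {
    String next();
}

public class SurfaceAudit {
    // Report whether a type is part of the public-stable contract.
    static boolean isPublicStable(Class<?> c) {
        return c.isAnnotationPresent(Public.class)
            && c.isAnnotationPresent(Stable.class);
    }

    public static void main(String[] args) {
        System.out.println(isPublicStable(RecordReader.class)); // true
    }
}
```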
> On Wed, May 11, 2011 at 4:42 PM, Ian Holsman <[EMAIL PROTECTED]> wrote:
> > For Apache (httpd, I'm assuming you mean), we define compatibility as
> adherence to the set of RFCs that define the HTTP protocol.
> >
> > I'm no expert in this (Roy is, though), but we could attempt to do
> something similar when it comes to HDFS/Map-Reduce protocols. I'm not sure
> what benefit there would be in going to an RFC, as opposed to documenting
> the API on our site.
> >
> >
> > On May 12, 2011, at 7:24 AM, Eric Baldeschwieler wrote:
> >
> >> This is a really interesting topic!  I completely agree that we need to
> get ahead of this.
> >>
> >> I would be really interested in learning of any experience other Apache
> projects, such as httpd or Tomcat, have had with these issues.
> >>
> >> ---
> >> E14 - typing on glass
> >>
> >> On May 10, 2011, at 6:31 AM, "Steve Loughran" <[EMAIL PROTECTED]>
> wrote:
> >>
> >>>
> >>> Back in Jan 2011, I started a discussion about how to define Apache
> >>> Hadoop Compatibility:
> >>>
> http://mail-archives.apache.org/mod_mbox/hadoop-general/201101.mbox/%[EMAIL PROTECTED]%3E
> >>>
> >>> I am now reading EMC HD "Enterprise Ready" Apache Hadoop datasheet
> >>>
> >>>
> http://www.greenplum.com/sites/default/files/EMC_Greenplum_HD_DS_Final_1.pdf
> >>>
> >>> It claims that their implementations are 100% compatible, even though
> >>> the Enterprise edition uses a C filesystem. It also claims that both
> >>> their software releases contain "Certified Stacks", without defining
> >>> what Certified means or who does the certification, only that it is an
> >>> improvement.
> >>>
> >>>
> >>> I think we should revisit this issue before people with their own
> >>> agendas define what compatibility with Apache Hadoop means for us.
> >>>
> >>>
> >>> Licensing
> >>> -Use of the Hadoop codebase must follow the Apache License
> >>> http://www.apache.org/licenses/LICENSE-2.0
> >>> -plug-in components that are dynamically linked to (filesystems and
> >>> schedulers) don't appear to be derivative works, on my reading of this.
> >>>
> >>> Naming
> >>> -this is something for branding@apache; they will have their opinions.
> >>> The key one is that the name "Apache Hadoop" must get used, and it's
> >>> important to make clear it is a derivative work.
> >>> -I don't think you can claim to have a Distribution/Fork/Version of
> >>> Apache Hadoop if you swap out big chunks of it for alternate
> >>> filesystems, MR engines, etc. Some description of this is needed
> >>> "Supports the Apache Hadoop MapReduce engine on top of Filesystem XYZ"
> >>>
> >>> Compatibility
> >>> -the definition of the Hadoop interfaces and classes is the Apache
> >>> Source tree,
> >>> -the definition of semantics of the Hadoop interfaces and classes is
> >>> the Apache Source tree, including the test classes.
> >>> -the verification that the actual semantics of an Apache Hadoop
> >>> release is compatible with the expected semantics is that current and
> >>> future tests pass
> >>> -bug reports can highlight incompatibility with expectations of
> >>> community users, and once incorporated into tests form part of the
> >>> compatibility testing
> >>> -vendors can claim and even certify their derivative works as
> >>> compatible with other versions of their derivative works, but cannot
> >>> claim compatibility with Apache Hadoop unless their code passes the
> >>> tests and is consistent with the bug reports marked as ("by design").
> >>> Perhaps we should have tests that verify each of these "by design"