Hadoop >> mail # general >> Defining Hadoop Compatibility -revisiting-


Steve Loughran 2011-05-10, 10:29
Andrew Purtell 2011-05-11, 07:43
Steve Loughran 2011-05-11, 10:34
Eric Baldeschwieler 2011-05-11, 21:24
Milind Bhandarkar 2011-05-11, 21:46
M. C. Srivas 2011-05-12, 02:26
Ted Dunning 2011-05-12, 04:37
Steve Loughran 2011-05-12, 09:32
Segel, Mike 2011-05-12, 09:49
Eric Baldeschwieler 2011-05-13, 05:05
Milind Bhandarkar 2011-05-12, 16:45
Konstantin Boudnik 2011-05-12, 22:30
Milind Bhandarkar 2011-05-13, 03:40
Konstantin Boudnik 2011-05-13, 06:24
Milind Bhandarkar 2011-05-13, 07:11
Konstantin Boudnik 2011-05-13, 17:47
Ian Holsman 2011-05-11, 22:42
Jacob R Rideout 2011-05-11, 22:56
Aaron Kimball 2011-05-11, 23:20
Steve Loughran 2011-05-12, 09:33
Konstantin Boudnik 2011-05-12, 22:26
Milind Bhandarkar 2011-05-13, 03:37
Ted Dunning 2011-05-13, 04:05
Milind Bhandarkar 2011-05-13, 04:52
Ted Dunning 2011-05-13, 05:38
Konstantin Boudnik 2011-05-13, 06:12
Milind Bhandarkar 2011-05-13, 06:57
Eric Baldeschwieler 2011-05-16, 05:34
Steve Loughran 2011-05-16, 10:50
Steve Loughran 2011-05-12, 09:23
Allen Wittenauer 2011-05-12, 16:45
Doug Cutting 2011-05-13, 06:16
Milind Bhandarkar 2011-05-13, 07:24
Doug Cutting 2011-05-13, 08:53
Ted Dunning 2011-05-13, 13:43
Doug Cutting 2011-05-13, 14:50
Nathan Roberts 2011-05-13, 15:19
Allen Wittenauer 2011-05-13, 17:28
Segel, Mike 2011-05-13, 17:32
Doug Cutting 2011-05-13, 21:55
Allen Wittenauer 2011-05-13, 22:13
Doug Cutting 2011-05-13, 22:16
Allen Wittenauer 2011-05-13, 22:17
Doug Cutting 2011-05-13, 22:22
Steve Loughran 2011-05-16, 11:15
Eli Collins 2011-05-13, 22:18
Ted Dunning 2011-05-13, 22:53
Allen Wittenauer 2011-05-13, 22:57
Steve Loughran 2011-05-16, 11:01
Segel, Mike 2011-05-16, 12:00
Steve Loughran 2011-05-16, 14:11
Allen Wittenauer 2011-05-16, 17:19
Eli Collins 2011-05-16, 21:09
Allen Wittenauer 2011-05-16, 21:25
Eli Collins 2011-05-16, 21:29
Allen Wittenauer 2011-05-16, 21:42
Ian Holsman 2011-05-16, 21:59
Konstantin Boudnik 2011-05-17, 01:52
Matthew Foley 2011-05-16, 21:17
Segel, Mike 2011-05-17, 00:40
Scott Carey 2011-05-17, 01:12
Segel, Mike 2011-05-17, 01:50
Eric Baldeschwieler 2011-05-17, 02:32
Andrew Purtell 2011-05-17, 02:52
Matthew Foley 2011-05-17, 09:19
Segel, Mike 2011-05-17, 12:52
Doug Cutting 2011-05-17, 13:24
Matthew Foley 2011-05-17, 17:53
Doug Cutting 2011-05-18, 13:20
Roy T. Fielding 2011-05-13, 22:26
Eric Baldeschwieler 2011-05-16, 05:34
Steve Loughran 2011-05-16, 11:20
Re: Defining Hadoop Compatibility -revisiting-
Agree.

On May 12, 2011, at 11:16 PM, Doug Cutting wrote:

> Certification seems like mission creep.  Our mission is to produce
> open-source software.  If we wish to produce testing software, that
> seems fine.  But running a certification program for non-open-source
> software seems like a different task.
>
> The Hadoop mark should only be used to refer to open-source software
> produced by the ASF.  If other folks wish to make factual statements
> concerning our software, e.g., that their proprietary software passes
> tests that we've created, that may be fine, but I don't think we should
> validate those claims by granting certifications to institutions.  That
> ventures outside the mission of the ASF.  We are not an accrediting
> organization.
>
> Doug
>
> On 05/10/2011 12:29 PM, Steve Loughran wrote:
>>
>> Back in Jan 2011, I started a discussion about how to define Apache
>> Hadoop Compatibility:
>> http://mail-archives.apache.org/mod_mbox/hadoop-general/201101.mbox/%[EMAIL PROTECTED]%3E
>>
>>
>> I am now reading EMC HD "Enterprise Ready" Apache Hadoop datasheet
>>
>> http://www.greenplum.com/sites/default/files/EMC_Greenplum_HD_DS_Final_1.pdf
>>
>>
>> It claims that their implementations are 100% compatible, even though
>> the Enterprise edition uses a C filesystem. It also claims that both
>> their software releases contain "Certified Stacks", without defining
>> what Certified means, or who does the certification -only that it is an
>> improvement.
>>
>>
>> I think we should revisit this issue before people with their own
>> agendas define what compatibility with Apache Hadoop is for us
>>
>>
>> Licensing
>> -Use of the Hadoop codebase must follow the Apache License
>> http://www.apache.org/licenses/LICENSE-2.0
>> -plug in components that are dynamically linked to (Filesystems and
>> schedulers) don't appear to be derivative works on my reading of this,
>>
>> Naming
>> -this is something for branding@apache, they will have their opinions.
>> The key one is that the name "Apache Hadoop" must get used, and it's
>> important to make clear it is a derivative work.
>> -I don't think you can claim to have a Distribution/Fork/Version of
>> Apache Hadoop if you swap out big chunks of it for alternate
>> filesystems, MR engines, etc. Some description of this is needed
>> "Supports the Apache Hadoop MapReduce engine on top of Filesystem XYZ"
>>
>> Compatibility
>> -the definition of the Hadoop interfaces and classes is the Apache
>> Source tree,
>> -the definition of semantics of the Hadoop interfaces and classes is
>> the Apache Source tree, including the test classes.
>> -the verification that the actual semantics of an Apache Hadoop release
>> is compatible with the expected semantics is that current and future
>> tests pass
>> -bug reports can highlight incompatibility with expectations of
>> community users, and once incorporated into tests form part of the
>> compatibility testing
>> -vendors can claim and even certify their derivative works as
>> compatible with other versions of their derivative works, but cannot
>> claim compatibility with Apache Hadoop unless their code passes the
>> tests and is consistent with the bug reports marked as ("by design").
>> Perhaps we should have tests that verify each of these "by design"
>> bugreps to make them more formal.
>>
>> Certification
>> -I have no idea what this means in EMC's case, they just say "Certified"
>> -As we don't do any certification ourselves, it would seem impossible
>> for us to certify that any derivative work is compatible.
>> -It may be best to state that nobody can certify their derivative as
>> "compatible with Apache Hadoop" unless it passes all current test suites
>> -And require that anyone who declares compatibility define what they
>> mean by this
>>
>> This is a good argument for getting more functional tests out there
>> -whoever has more functional tests needs to get them into a test module
Steve Loughran 2011-05-24, 16:23
Owen OMalley 2011-05-31, 22:08