Hadoop, mail # general - Defining Hadoop Compatibility -revisiting-


Re: Defining Hadoop Compatibility -revisiting-
Sanjay Radia 2011-05-23, 16:27
Agree.

On May 12, 2011, at 11:16 PM, Doug Cutting wrote:

> Certification seems like mission creep.  Our mission is to produce
> open-source software.  If we wish to produce testing software, that
> seems fine.  But running a certification program for non-open-source
> software seems like a different task.
>
> The Hadoop mark should only be used to refer to open-source software
> produced by the ASF.  If other folks wish to make factual statements
> concerning our software, e.g., that their proprietary software passes
> tests that we've created, that may be fine, but I don't think we should
> validate those claims by granting certifications to institutions.  That
> ventures outside the mission of the ASF.  We are not an accrediting
> organization.
>
> Doug
>
> On 05/10/2011 12:29 PM, Steve Loughran wrote:
>>
>> Back in Jan 2011, I started a discussion about how to define Apache
>> Hadoop Compatibility:
>> http://mail-archives.apache.org/mod_mbox/hadoop-general/201101.mbox/%[EMAIL PROTECTED]%3E
>>
>>
>> I am now reading EMC HD "Enterprise Ready" Apache Hadoop datasheet
>>
>> http://www.greenplum.com/sites/default/files/EMC_Greenplum_HD_DS_Final_1.pdf
>>
>>
>> It claims that their implementations are 100% compatible, even though
>> the Enterprise edition uses a C filesystem. It also claims that both
>> their software releases contain "Certified Stacks", without defining
>> what Certified means, or who does the certification -only that it is an
>> improvement.
>>
>>
>> I think we should revisit this issue before people with their own
>> agendas define what compatibility with Apache Hadoop is for us
>>
>>
>> Licensing
>> -Use of the Hadoop codebase must follow the Apache License
>> http://www.apache.org/licenses/LICENSE-2.0
>> -plug in components that are dynamically linked to (Filesystems and
>> schedulers) don't appear to be derivative works on my reading of this,
>>
>> Naming
>> -this is something for branding@apache, they will have their opinions.
>> The key one is that the name "Apache Hadoop" must get used, and it's
>> important to make clear it is a derivative work.
>> -I don't think you can claim to have a Distribution/Fork/Version of
>> Apache Hadoop if you swap out big chunks of it for alternate
>> filesystems, MR engines, etc. Some description of this is needed
>> "Supports the Apache Hadoop MapReduce engine on top of Filesystem XYZ"
>>
>> Compatibility
>> -the definition of the Hadoop interfaces and classes is the Apache
>> Source tree,
>> -the definition of semantics of the Hadoop interfaces and classes is
>> the Apache Source tree, including the test classes.
>> -the verification that the actual semantics of an Apache Hadoop release
>> is compatible with the expected semantics is that current and future
>> tests pass
>> -bug reports can highlight incompatibility with expectations of
>> community users, and once incorporated into tests form part of the
>> compatibility testing
>> -vendors can claim and even certify their derivative works as
>> compatible with other versions of their derivative works, but cannot
>> claim compatibility with Apache Hadoop unless their code passes the
>> tests and is consistent with the bug reports marked as ("by design").
>> Perhaps we should have tests that verify each of these "by design"
>> bugreps to make them more formal.
>>
>> Certification
>> -I have no idea what this means in EMC's case, they just say "Certified"
>> -As we don't do any certification ourselves, it would seem impossible
>> for us to certify that any derivative work is compatible.
>> -It may be best to state that nobody can certify their derivative as
>> "compatible with Apache Hadoop" unless it passes all current test suites
>> -And require that anyone who declares compatibility define what they
>> mean by this
>>
>> This is a good argument for getting more functional tests out there
>> -whoever has more functional tests needs to get them into a test module