Hadoop, mail # general - Defining Hadoop Compatibility -revisiting-


Re: Defining Hadoop Compatibility -revisiting-
Konstantin Boudnik 2011-05-13, 17:47
On Fri, May 13, 2011 at 00:11, Milind Bhandarkar
<[EMAIL PROTECTED]> wrote:
> Cos,
>
> I remember the issues about the "inter-component interactions" at that
> point when you were part of the Yahoo Hadoop FIT team (I was on the other
> side of the same floor, remember ? ;-)

Vaguely ;) Of course I remember. But I prefer not to mention any
internal technologies developed for private companies, after getting
lashes for that.

> Things like, "Can Pig take full URIs as input, and so work with viewfs",
> "Can Local jobtracker still use HDFS as input and output", "Can Oozie use
> local file system to keep workflows, while the jars are located on hdfs"
> etc. came up often.
>
> Each of these issues was a component-interaction issue, and resulted
> from making DistributedFileSystem a public class, or from some subtle
> dependency on the semantics of a particular method in an interface, which
> was not explicit in the syntax.
>
> That's an issue with interface-compatibility, and so merely compiling
> against a particular interface is not a solution. One needs a test-suite.
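
As an illustration of the point above (not part of the original thread),
here is a minimal Java sketch of two implementations that both compile
against the same interface but differ in semantics. PathResolver, both
implementing classes, and the viewfs://cluster/data/in input are
hypothetical names chosen for the example, not real Hadoop APIs. Only the
small semantic check tells the two apart.

```java
import java.net.URI;

public class CompatSketch {
    /** Hypothetical component interface; not a real Hadoop API. */
    interface PathResolver {
        /** Returns the path portion the component will read from. */
        String resolve(String input);
    }

    /** Accepts plain paths and full URIs such as viewfs://cluster/data/in. */
    static class UriAwareResolver implements PathResolver {
        public String resolve(String input) {
            return URI.create(input).getPath();
        }
    }

    /** Compiles against the same interface but assumes a plain path. */
    static class PlainPathResolver implements PathResolver {
        public String resolve(String input) {
            return input;
        }
    }

    /** A semantic check that the method signature alone cannot express. */
    static boolean handlesFullUri(PathResolver resolver) {
        return "/data/in".equals(resolver.resolve("viewfs://cluster/data/in"));
    }

    public static void main(String[] args) {
        System.out.println("uri-aware:  " + handlesFullUri(new UriAwareResolver()));
        System.out.println("plain-path: " + handlesFullUri(new PlainPathResolver()));
    }
}
```

Both classes satisfy the compiler; the difference shows up only when a
test exercises the behavior a caller such as Pig would rely on.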

One needs more than a mere test-suite, if experience teaches us
anything. FIT and its continuation turned out to be a complex program (not
only in the sense of computer code) with many moving parts, bells and
whistles. One of those was a set of specs actually written in plain
English. The downside is that someone needs to keep them up to date,
translate them into test cases or teach others how to do it, etc. That's
exactly why TCK used a test generator and a somewhat formalized spec
language.

Cos

> (With annotations in Java, one can impose more semantic restrictions on
> the interface, which can be checked automatically at runtime. But this
> is limited to individual methods, or the full class. Code generation using
> perl or whatever is similar in capability.)
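
As an illustration of the annotation idea (not part of the original
thread), here is a minimal Java sketch of a per-method restriction that is
checked at runtime through reflection. The @NeverReturnsNull annotation,
the Catalog interface, and the honorsContract helper are hypothetical
names invented for the example. The check assumes every annotated method
takes a single String argument.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

public class AnnotationSketch {
    /** Hypothetical marker: the annotated method must not return null. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface NeverReturnsNull {}

    /** Hypothetical component interface carrying the restriction. */
    interface Catalog {
        @NeverReturnsNull
        String lookup(String key);
    }

    static class GoodCatalog implements Catalog {
        public String lookup(String key) {
            return key == null ? "" : key.toUpperCase();
        }
    }

    static class BadCatalog implements Catalog {
        public String lookup(String key) {
            return null; // compiles fine, violates the annotated contract
        }
    }

    /** Calls each annotated Catalog method once and checks the result. */
    static boolean honorsContract(Catalog catalog, String sampleKey) {
        try {
            for (Method m : Catalog.class.getMethods()) {
                if (m.isAnnotationPresent(NeverReturnsNull.class)
                        && m.invoke(catalog, sampleKey) == null) {
                    return false;
                }
            }
            return true;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("contract check failed to run", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("good: " + honorsContract(new GoodCatalog(), "pig"));
        System.out.println("bad:  " + honorsContract(new BadCatalog(), "pig"));
    }
}
```

The restriction covers one method at a time, which matches the limitation
noted above: it says nothing about how several components interact.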
>
> - milind
> --
> Milind Bhandarkar
> [EMAIL PROTECTED]
> +1-650-776-3167
>
> On 5/12/11 11:24 PM, "Konstantin Boudnik" <[EMAIL PROTECTED]> wrote:
>
>>On Thu, May 12, 2011 at 20:40, Milind Bhandarkar
>><[EMAIL PROTECTED]> wrote:
>>> Cos,
>>>
>>> Can you give me an example of a "system test" that is not a functional
>>> test? My assumption was that the functionality being tested is specific
>>> to a component, and that inter-component interactions (that's what you
>>> meant, right?) would be taken care of by the public interface and
>>> semantics of a component API.
>>
>>Milind, kinda... However, to exercise inter-component interactions via
>>component APIs one needs tests that go beyond the functional or
>>component realm (e.g. system tests). At some point I was part of a team
>>working on an integration validation framework for Hadoop (FIT), which
>>addressed inter-component interaction validation, essentially
>>guaranteeing the components' compatibility. Components being Hadoop,
>>Pig, Oozie, etc. - thus exercising the whole application stack and
>>covering a lot of use cases.
>>
>>Having a framework like this and a set of test cases available to the
>>Hadoop community is a great benefit, because one can quickly make sure
>>that a Hadoop stack built from a set of components is working
>>properly. Another use case is to run the same set of tests - versioned
>>separately from the product itself - against a previous and a next
>>release, validating their compatibility at the functional level (sorta
>>what you have mentioned).
>>
>>This doesn't, by the way, depend on whether we choose to work on HCK or
>>not; however, HCK might eventually be based on such a framework.
>>
>>Cos
>>
>>> - milind
>>>
>>> On 5/12/11 3:30 PM, "Konstantin Boudnik" <[EMAIL PROTECTED]> wrote:
>>>
>>>>On Thu, May 12, 2011 at 09:45, Milind Bhandarkar
>>>><[EMAIL PROTECTED]> wrote:
>>>>> HCK and written specifications are not mutually exclusive. However,
>>>>> given the evolving nature of Hadoop APIs, functional tests need to
>>>>> evolve as