Re: Defining Hadoop Compatibility -revisiting-
On Thu, May 12, 2011 at 20:40, Milind Bhandarkar
<[EMAIL PROTECTED]> wrote:
> Cos,
>
> Can you give me an example of a "system test" that is not a functional
> test? My assumption was that the functionality being tested is specific
> to a component, and that inter-component interactions (that's what you
> meant, right?) would be taken care of by the public interface and semantics
> of a component API.

Milind, kinda... However, to exercise inter-component interactions via
component APIs one needs tests that go beyond the functional or
component realm (i.e. system tests). At some point I was part of a team
working on an integration validation framework for Hadoop (FIT) which
addressed inter-component interaction validations, essentially
guaranteeing their compatibility. The components were Hadoop, Pig,
Oozie, etc. - thus exercising the whole application stack and covering
a lot of use cases.
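
For concreteness, here is a rough, purely illustrative sketch of what
such a system-level check might look like (this is not code from FIT;
the paths and the query are made up): it writes data through the HDFS
client API and reads it back through Pig on MapReduce, so it passes
only when the two components actually work together.

// Illustrative system-level check only; not code from FIT. It
// exercises HDFS and Pig together, which no functional test of either
// component alone would cover.
import java.util.Iterator;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.pig.ExecType;
import org.apache.pig.PigServer;
import org.apache.pig.data.Tuple;

public class HdfsPigInteractionCheck {
    public static void main(String[] args) throws Exception {
        // Write input through the HDFS client API.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path input = new Path("/tmp/compat-check/words.txt");
        FSDataOutputStream out = fs.create(input, true);
        out.writeBytes("hadoop\npig\noozie\n");
        out.close();

        // Read the same data back through Pig running on MapReduce.
        PigServer pig = new PigServer(ExecType.MAPREDUCE);
        pig.registerQuery(
            "words = LOAD '/tmp/compat-check/words.txt' AS (w:chararray);");
        Iterator<Tuple> it = pig.openIterator("words");

        int rows = 0;
        while (it.hasNext()) {
            it.next();
            rows++;
        }
        if (rows != 3) {
            throw new AssertionError("HDFS/Pig round trip lost records: " + rows);
        }
    }
}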

Having a framework like this and a set of test cases available to the
Hadoop community is a great benefit, because one can quickly make sure
that a Hadoop stack built from a set of components is working
properly. Another use case is to run the same set of tests - versioned
separately from the product itself - against the previous and the next
release, validating their compatibility at the functional level (sorta
what you have mentioned).

This doesn't, by the way, determine whether we'd choose to work on HCK
or not; however, HCK might eventually be built on top of such a
framework.

Cos

> - milind
>
> --
> Milind Bhandarkar
> [EMAIL PROTECTED]
> +1-650-776-3167
>
>
>
>
>
>
> On 5/12/11 3:30 PM, "Konstantin Boudnik" <[EMAIL PROTECTED]> wrote:
>
>>On Thu, May 12, 2011 at 09:45, Milind Bhandarkar
>><[EMAIL PROTECTED]> wrote:
>>> HCK and written specifications are not mutually exclusive. However,
>>>given
>>> the evolving nature of Hadoop APIs, functional tests need to evolve as
>>
>>I would actually expand it to 'functional and system tests' because
>>the latter are capable of validating inter-component interactions not
>>coverable by functional tests.
>>
>>Cos
>>
>>> well, and having them tied to a "current stable" version is easier to do
>>> than it is to tie the written specifications.
>>>
>>> - milind
>>>
>>> --
>>> Milind Bhandarkar
>>> [EMAIL PROTECTED]
>>> +1-650-776-3167
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 5/11/11 7:26 PM, "M. C. Srivas" <[EMAIL PROTECTED]> wrote:
>>>
>>>>While the HCK is a great idea to check quickly if an implementation is
>>>>"compliant",  we still need a written specification to define what is
>>>>meant
>>>>by compliance, something akin to a set of RFCs, or a set of docs like
>>>>the
>>>> IEEE POSIX specifications.
>>>>
>>>>For example, the POSIX.1c pthreads API has a written document that
>>>>specifies
>>>>all the function calls, input params, return values, and error codes. It
>>>>clearly indicates what any POSIX-compliant threads package needs to
>>>>support,
>>>>and which vendor-specific non-portable extensions one can use at
>>>>one's own risk.
>>>>
>>>>Currently we have two sets of APIs in the DFS and Map/Reduce layers, and
>>>>the
>>>>specification is extracted only by looking at the code, or (where the
>>>>code
>>>>is non-trivial) by writing really bizarre test programs to examine
>>>>corner
>>>>cases. Further, the interaction between a mix of the old and new APIs is
>>>>not
>>>>specified anywhere. Such specifications are vitally important when
>>>>implementing libraries like Cascading, Mahout, etc. For example, an
>>>>application might open a file using the new API, and pass that stream
>>>>into a
>>>>library that manipulates the stream using some of the old API ... what
>>>>is
>>>>then the expectation of the state of the stream when the library call
>>>>returns?
>>>>
>>>>Sanjay Radia @ Y! already started specifying some of the DFS APIs to nail
>>>>such
>>>>things down. There's a similar good effort in the Map/Reduce and Avro
>>>>spaces,
>>>>but it seems to have stalled somewhat. We should continue it.
>>>>
>
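
To make the mixed old/new API scenario quoted above concrete, here is a
purely hypothetical sketch (the file path and the LegacyFormatReader
stub are made up; FileContext stands in as the "new" DFS-layer API and
FileSystem-era code as the "old" one). Nothing written down today says
where the stream position should be when such a library call returns.

// Hypothetical illustration of the mixed old/new API question raised
// above; not taken from any Hadoop specification or test suite.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileContext;
import org.apache.hadoop.fs.Path;

public class MixedApiStreamState {

    // Stub standing in for third-party library code written against the
    // older FileSystem-era conventions; suppose it seeks around internally.
    static class LegacyFormatReader {
        static void scanHeader(FSDataInputStream in) throws IOException {
            in.seek(128);   // e.g. skips a fixed-size header
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Open the file through the newer FileContext API.
        FileContext fc = FileContext.getFileContext(conf);
        FSDataInputStream in = fc.open(new Path("/data/events.log"));

        // Hand the stream to the old-style library code.
        LegacyFormatReader.scanHeader(in);

        // What may the caller assume here? Is the position restored, left
        // just past the header, or unspecified? No written spec says.
        System.out.println("position after library call: " + in.getPos());
    }
}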