Hadoop, mail # dev - Re: Compatibility in Apache Hadoop


Re: Compatibility in Apache Hadoop
Andrew Purtell 2013-04-23, 18:58
API jars containing only interfaces would work. They would be small and
lightweight, and presumably would carry no dependencies other than other
API modules.

Some additional details of factoring out the codecs might be debatable
(e.g. native code loading) but I do feel it is a useful motivating example
for such API vs implementation separation.
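
To make the API vs implementation split concrete, here is a minimal sketch. It
assumes simplified versions of Seekable and PositionedReadable (the method lists
are trimmed for illustration and are not the full Hadoop definitions), and
ByteArraySeekableInput is a hypothetical implementation class that does not
exist in Hadoop. The point is only that the interfaces need nothing beyond the
JDK, so a module holding them could stay dependency-free.

```java
import java.io.EOFException;
import java.io.IOException;

// Hypothetical "API module": interfaces only, nothing beyond the JDK.
// Simplified for illustration; not the complete Hadoop interfaces.
interface Seekable {
    void seek(long pos) throws IOException;
    long getPos() throws IOException;
}

interface PositionedReadable {
    int read(long position, byte[] buffer, int offset, int length) throws IOException;
}

// Hypothetical "implementation module": depends only on the interfaces above.
final class ByteArraySeekableInput implements Seekable, PositionedReadable {
    private final byte[] data;
    private long pos;

    ByteArraySeekableInput(byte[] data) {
        this.data = data.clone();
    }

    @Override
    public void seek(long newPos) throws IOException {
        // Reject bad offsets at seek time rather than on a later access.
        if (newPos < 0 || newPos > data.length) {
            throw new EOFException("Cannot seek to offset " + newPos);
        }
        pos = newPos;
    }

    @Override
    public long getPos() {
        return pos;
    }

    @Override
    public int read(long position, byte[] buffer, int offset, int length) throws IOException {
        if (position < 0) {
            throw new EOFException("Cannot read at negative offset " + position);
        }
        if (position >= data.length) {
            return -1; // end of data
        }
        int n = (int) Math.min(length, data.length - position);
        System.arraycopy(data, (int) position, buffer, offset, n);
        return n; // a positioned read leaves the current position untouched
    }
}
```

A codec module written against these two interfaces would compile without
pulling in any filesystem implementation.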
On Tue, Apr 23, 2013 at 11:50 AM, Alejandro Abdelnur <[EMAIL PROTECTED]> wrote:

> Andrew,
>
> Or with a twist, why not break/consolidate things as follows?
>
> common API
> common IMPL
> hdfs CLIENT IMPL
> hdfs SERVER IMPL
> hdfs TOOLS
> <other filesystems> CLIENT
> yarn API
> yarn CLIENT IMPL
> yarn SERVER IMPL
> yarn TOOLS
> mapred API
> mapred IMPL
> mapred TOOLS
>
> IMO, this would help significantly to reduce dependency hell (like bringing
> servlet, jetty JAR to a hadoop client app).
>
> Thx
>
> On Tue, Apr 23, 2013 at 11:32 AM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
>
> > At the risk of hijacking this conversation a bit, what do you think of the
> > notion of moving interfaces like Seekable and PositionedReadable into a new
> > foundational Maven module, perhaps just for such interfaces that define and
> > tag support for core semantics, as their details are better defined and
> > documented? I was involved in a discussion today considering factoring out
> > the codecs so other ecosystem projects might pull in only codec code.
> > This is similar to how hadoop-auth is slender and has a useful servlet
> > filter implementing SPNEGO authentication, and so it is pulled into various
> > places, and can even be used with Hadoop 1. The only thing preventing a
> > clean separation of codecs like this is imports of Seekable and
> > PositionedReadable. But these define behavior, they don't implement it.
> >
> >
> > On Tue, Apr 23, 2013 at 9:00 AM, Steve Loughran <[EMAIL PROTECTED]> wrote:
> >
> > > On 22 April 2013 18:32, Eli Collins <[EMAIL PROTECTED]> wrote:
> > >
> > > > On Mon, Apr 22, 2013 at 5:42 PM, Steve Loughran <[EMAIL PROTECTED]> wrote:
> > > >
> > > > >
> > > > > There's a separate issue that says "we make some guarantee that the
> > > > > behaviour of an interface remains consistent over versions", which is
> > > > > hard to do without some rigorous definition of what the expected
> > > > > behaviour of an implementation should be.
> > > >
> > > >
> > > > Good point, Steve.  I've assumed the semantics of the API had to
> > > > respect the attribute (eg changing the semantics of FileSystem#close
> > > > would be an incompatible change, since this is a public/stable API,
> > > > even if the new semantics are arguably better).  But you're right,
> > > > unless we've actually defined what the semantics of the APIs are it's
> > > > hard to say if we've materially changed them.  How about adding a new
> > > > section on the page and calling that out explicitly?
> > > >
> > >
> > > +1.
> > >
> > > Maybe we should list which bits we consider both well specified and
> > > covered with tests that verify the implementations in our svn match that
> > > specification.
> > >
> > >
> > > >
> > > > In practice I think we'll have to take semantics case by case, clearly
> > > > define the semantics we care about better in the javadocs (for the
> > > > major end user-facing classes at least, calling out both intended
> > > > behavior and behavior that's meant to be undefined) and use
> > > > individual judgement elsewhere.  For example, HDFS-4156 changed
> > > > DataInputStream#seek to throw an IOE if you seek to a negative offset,
> > > > instead of succeeding then resulting in an NPE on the next access.
> > > >
> > >
> > > I'd seen that the DFS seek was the best implementation, but hadn't seen
> > > the cause. The other ones (especially the Buffered one that goes in front
> > > of most others) are much weaker.
> > >
> > >
> > > > That's an incompatible change in terms of semantics, but not semantics
> > > > intended by the author, or likely semantics programs depend on.
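
The quoted suggestion of tests that verify implementations against a stated
specification could be sketched roughly as below. Everything here is
hypothetical and for illustration only: SeekableSketch stands in for the real
interface, SeekContract is an invented helper, and EagerSeek and LazySeek are
toy implementations that model the "fail at seek time" and "fail on a later
access" behaviours discussed above. It is not code from Hadoop.

```java
import java.io.IOException;

// Hypothetical minimal interface standing in for the seek contract under discussion.
interface SeekableSketch {
    void seek(long pos) throws IOException;
    long getPos() throws IOException;
}

// Hypothetical contract check that any implementation can be run through.
final class SeekContract {
    private SeekContract() {}

    // True if seek(-1) fails with an IOException and leaves the position unchanged.
    static boolean rejectsNegativeSeek(SeekableSketch s) throws IOException {
        long before = s.getPos();
        try {
            s.seek(-1);
        } catch (IOException expected) {
            return s.getPos() == before;
        }
        return false; // the negative offset was silently accepted
    }
}

// Toy implementation that rejects a negative offset immediately.
final class EagerSeek implements SeekableSketch {
    private long pos;

    @Override
    public void seek(long newPos) throws IOException {
        if (newPos < 0) {
            throw new IOException("Cannot seek to negative offset " + newPos);
        }
        pos = newPos;
    }

    @Override
    public long getPos() {
        return pos;
    }
}

// Toy implementation that accepts any offset, so a failure would only surface later.
final class LazySeek implements SeekableSketch {
    private long pos;

    @Override
    public void seek(long newPos) {
        pos = newPos;
    }

    @Override
    public long getPos() {
        return pos;
    }
}
```

Running SeekContract.rejectsNegativeSeek against each implementation separates
the two behaviours: EagerSeek passes the check and LazySeek does not.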

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)