Hadoop, mail # dev - symlink support in Hadoop 2 GA


Re: symlink support in Hadoop 2 GA
Eli Collins 2013-09-18, 16:24
On Wed, Sep 18, 2013 at 5:45 AM, Steve Loughran <[EMAIL PROTECTED]> wrote:

> On 18 September 2013 12:53, Alejandro Abdelnur <[EMAIL PROTECTED]> wrote:
>
> > On Wed, Sep 18, 2013 at 11:29 AM, Steve Loughran <[EMAIL PROTECTED]> wrote:
> >
> > > I'm reluctant to do this because, while delaying the release, we are
> > > going to find problems all the way up the stack, which will require a
> > > choreographed set of changes. Given the grief of the protobuf update, I
> > > don't want to go near that just before the final release.
> > >
> >
> > Well, I would use the exact same argument used for protobuf (whose only
> > complication was getting protoc 2.5.0 onto the Jenkins boxes and telling
> > developers to do the same; other than that we didn't hit any other issue
> > AFAIK) ...
> >
>
> protobuf was traumatic at build time, as I recall, because it was neither
> forwards nor backwards compatible. Those of us trying to build different
> branches had to choose which version to have on the path, or set up scripts
> to do the switching. HBase needed rebuilding, so did other things. And I
> still have the pain of downloading and installing protoc on all Linux VMs I
> build up going forward, until apt-get and yum have protoc 2.5 artifacts.
>
> This means it was very painful for developers and added a lot of
> late-breaking pain, but it had one key feature that gave it an edge: it
> was immediately obvious where you had a problem, as things didn't compile or
> classload without linkage problems. No latent bugs, unless protobuf 2.5 has
> them internally -for which we have to rely on Google's release testing to
> have found.
>
> That is a lot simpler to regression test than adding any new feature to
> HDFS and seeing what breaks -as that is something that only surfaces out in
> the field. Which is why I think it's too late in the 2.1 release timetable
> to add symlinks. We've had a 2.1-beta out there, we've got feedback. Fix
> those problems that are show stoppers, but don't add more stuff. Which is
> precisely why I have not been pushing in any of my recent changes. I may
> seem ruthless arguing against symlinks -but I'm not being inconsistent with
> my own commit history. The only two things I've put in branch-2.1 since
> beta-1 were a separate log for the Configuration deprecation warnings and a
> patch to the POM for a java7 build on OSX: and they weren't even my
> patches.
>
>
> -Steve
>
> (One of these days I should volunteer to be the release manager and it'll
> be obvious that Arun is being quite amenable to all the other developers)
>
>
>
> >
> > IMO, it makes more sense to do this change during the beta rather than at
> > GA. That gives us more flexibility to iron out things if necessary.
> >
> >
> I'm arguing this change can go into the beta of the successor to 2.1 -not
> GA.
>
>
What does "this change" refer to?  Symlinks are already in 2.1, and the
existing semantics create problems for programs (e.g. see the Pig
example in HADOOP-9912) that we need to resolve.  I don't think doing
nothing is an option for 2.2 GA.

Thanks,
Eli
