Re: symlink support in Hadoop 2 GA
I reluctantly agree that we should disable symlinks in 2.2 until we can sort out the compatibility issues.  I'm reluctant in the sense that it's a feature users have long wanted, and it's something we'd like to use from an administrative view.  However, I don't see all the issues being sorted out in the very near future.

I filed some jiras today that have led me to believe that the current implementation of fs symlinks is irreparably flawed.  Adding optional primitives to filesystems to make them symlink-capable is OK.  However, adding symlink resolution to individual filesystems is fundamentally broken.  It doesn't work for stacked filesystems (viewfs, chroots, filters, etc.) because the resolution must occur at the highest level, not within an individual filesystem itself.  Otherwise the abstraction of the top-level filesystem is violated, and all kinds of unexpected behavior, like walking out of chroots, become possible.
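
To make the chroot problem concrete, here is a minimal sketch in plain Java. It is a toy model, not Hadoop code: the class names, the string paths, and the map of symlinks are all hypothetical, it follows only one link hop, and it assumes an absolute link target is meant to be read in the namespace the caller sees.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model, not Hadoop code: paths are plain strings and
// a symlink is an entry in a map from link path to absolute target path.
class ChrootSymlinkSketch {

    /** Innermost filesystem: knows about symlinks, knows nothing about chroots. */
    static class BaseFs {
        final Map<String, String> symlinks = new HashMap<>();

        /** Resolves a symlink itself (resolution inside an individual filesystem). */
        String resolveInside(String path) {
            String target = symlinks.get(path);
            return target != null ? target : path;
        }

        /** Optional primitive: reports the raw target, or null if path is not a link. */
        String readLink(String path) {
            return symlinks.get(path);
        }
    }

    /** Stacked filesystem that confines callers to everything under root. */
    static class ChrootFs {
        final BaseFs base;
        final String root;

        ChrootFs(BaseFs base, String root) {
            this.base = base;
            this.root = root;
        }

        /** Delegates resolution downward; the chroot never gets to see the target. */
        String openResolvedBelow(String path) {
            return base.resolveInside(root + path);
        }

        /** Resolves at the top level: the target is re-entered through the chroot. */
        String openResolvedOnTop(String path) {
            String full = root + path;
            String target = base.readLink(full);
            return target == null ? full : root + target;
        }
    }

    public static void main(String[] args) {
        BaseFs base = new BaseFs();
        base.symlinks.put("/jail/link", "/etc/secret");
        ChrootFs jail = new ChrootFs(base, "/jail");

        // Resolved inside the base filesystem: the path leaves the chroot.
        System.out.println(jail.openResolvedBelow("/link"));
        // Resolved on top: the same link stays under the chroot root.
        System.out.println(jail.openResolvedOnTop("/link"));
    }
}
```

In this model the base filesystem only needs the readLink primitive; the decision about what a target means stays with the outermost layer, which is the only layer that knows about the chroot.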


On Oct 3, 2013, at 1:39 PM, sanjay Radia wrote:

> There are a number of issues (some minor, some more than minor).
> GA is close and we are still in discussion on some of them; while I believe we will close on these very shortly, a code change like this so close to GA is dangerous.
> I suggest we do the following:
> 1) Disable symlinks in 2.2 GA: throw an unsupported exception on createSymlink in both FileSystem and FileContext.
> 2) Deal with isDir() in 2.2 GA in preparation for item 3, coming after GA:
>   a) Deprecate isDir().
>   b) Add a new API that returns an enum (see FileContext).
> 3) Fix symlinks in a future release, hopefully the very next one after 2.2 GA:
>   a) Change the stack to use the new API replacing isDir().
>   b) Fix isDir() to do something smarter (we can detail this later, but there is a solution that has been discussed). This helps customer applications that call isDir().
>   c) Remove isDir() in a future release when customers have had sufficient time to migrate.
> sanjay
> PS. J Rottinghuis expressed a similar sentiment in a previous email in this thread:
> On Sep 18, 2013, at 5:11 PM, J. Rottinghuis wrote:
>> I like symlink functionality, but in our migration to Hadoop 2.x this is a
>> total distraction. If the APIs stay in 2.2 GA we'll have to choose to:
>> a) Not uprev until symlink support is figured out up and down the stack,
>> and we've been able to migrate all our 1.x (equivalent) clusters to 2.x
>> (equivalent). Or
>> b) rip out the API altogether. Or
>> c) change the implementation to throw an UnsupportedOperationException
>> I'm not sure yet which of these I like least.
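
Items 1 and 2 of Sanjay's list could look roughly like the sketch below. Every name here is hypothetical (FileType, Status, getFileType, and this createSymlink signature are illustrations, not the actual Hadoop classes or the API that was eventually chosen); it only shows the shape of the proposal: the symlink entry point rejects every call, and callers switch on an enum instead of a boolean isDir().

```java
// Hypothetical names throughout; none of these are the actual Hadoop
// classes or signatures. This only illustrates the shape of the proposal.
class SymlinkDisableSketch {

    /** Item 2b: an enum result instead of a boolean isDir(). */
    enum FileType { FILE, DIRECTORY, SYMLINK }

    /** Minimal stand-in for a file status object. */
    static class Status {
        private final FileType type;

        Status(FileType type) {
            this.type = type;
        }

        /** Item 2a: kept for existing callers, but marked deprecated. */
        @Deprecated
        boolean isDir() {
            return type == FileType.DIRECTORY;
        }

        FileType getFileType() {
            return type;
        }
    }

    /** Item 1: the symlink entry point rejects every call. */
    static void createSymlink(String target, String link) {
        throw new UnsupportedOperationException("Symlinks not supported");
    }

    /** A caller written against the enum can tell a symlink apart from a file. */
    static String describe(Status s) {
        switch (s.getFileType()) {
            case DIRECTORY: return "directory";
            case SYMLINK: return "symlink";
            default: return "file";
        }
    }
}
```

A boolean isDir() can only answer "directory" or "not a directory", so a symlink is indistinguishable from a plain file; the three-valued result is what lets the stack handle links explicitly once they are re-enabled.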