

Re: relative symbolic links in HDFS
On Oct 31, 2011, at 1:45 PM, Eli Collins wrote:

> On Mon, Oct 31, 2011 at 9:27 AM, Charles Baker <[EMAIL PROTECTED]> wrote:
>> But even
>> if it did return the relative path, it seems counter-intuitive to me. I agree
>> with Daryn and expect the behavior of getFileLinkStatus() to return the
>> symlink as is and not presume that I wanted it qualified. If I wanted a
>> qualified path for a symlink, I would expect to call Path.makeQualified() to
>> do so.
>
> It does this because getFileStatus always returns fully qualified
> paths in HDFS, and we don't want to make callers check the type and
> care about the method that was used to obtain the FileStatus, eg to
> know whether it contains a fully qualified path or not.
>
> I think the original rationale for why FileStatus objects always
> have fully qualified paths is so they can be passed around w/o callers
> having to do further work to access them, ie didn't want to disassociate
> the path from the file system it exists on. Note that in Hadoop
> "Paths" are actually URIs, vs file system paths (a subset of URIs).
>
> Regardless of the rationale, changing getFileStatus to return objects
> w/o fully qualified paths would break compatibility with a lot of
> existing programs. It would also hinder people porting to FileContext
> which tries to be consistent with FileSystem.
>
> Would a new method on FileStatus or Path that returns the unqualified
> version of the path (ie w/o the scheme and authority, and w/o
> resolving relative paths relative to the FileContext) work?  Ie the
> FileStatus could return the contents of the HdfsFileStatus w/o making
> it fully qualified.
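
For readers following the thread, the difference between a link target "as is" and its fully qualified form can be sketched as below. This is an illustration only: `QualifySketch` and `qualify` are hypothetical names built on `java.net.URI`, not Hadoop's `Path.makeQualified()`, and the namenode address and directories are made-up values.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical illustration; this is not Hadoop code.
class QualifySketch {
    // Fills in scheme and authority from a default file system URI and
    // resolves a relative path against a working directory.
    static URI qualify(URI defaultFs, String workingDir, String path) {
        URI raw = URI.create(path);
        if (raw.getScheme() != null) {
            return raw; // already carries a scheme, leave it untouched
        }
        String p = raw.getPath();
        if (!p.startsWith("/")) {
            // Relative path: prefix the working directory.
            p = workingDir.endsWith("/") ? workingDir + p : workingDir + "/" + p;
        }
        String normalized = URI.create(p).normalize().getPath(); // collapses ".."
        try {
            return new URI(defaultFs.getScheme(), defaultFs.getAuthority(),
                           normalized, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI fs = URI.create("hdfs://namenode:8020"); // assumed default file system
        String target = "../target/file";            // link target as the user wrote it
        System.out.println("as is:     " + target);
        System.out.println("qualified: " + qualify(fs, "/user/charles/links", target));
    }
}
```

Once the target has been qualified, the original relative spelling cannot be recovered from the result alone, which is the information the unqualified accessor discussed above would preserve.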

Off the top of my head: If we can't change the FileStatus behavior, I think FsShell would be well served by a Path#getRawUri() method that returned the exact uri/string used to instantiate it. As long as FileStatus preserves the path it's given, I think it would work well. The various Path ctors would need to be modified to update the rawUri by tacking on directory components, or removing them.

I don't suppose it'd be ok for Path#toString() to return the stringified raw uri? :)

Daryn
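
A rough sketch of the `getRawUri()` idea from this message follows. It is not Hadoop's `Path` class: `RawPath`, its constructor, `child()`, `parent()`, and `qualified()` are all hypothetical names, and the base URI is an assumed value. The raw string is kept exactly as given, the child and parent operations stand in for the constructors that tack on or remove components, and the qualified form is derived on demand.

```java
import java.net.URI;

// Hypothetical stand-in for the proposal; not Hadoop's Path.
final class RawPath {
    private final String raw; // exact string the caller supplied
    private final URI base;   // assumed qualified working directory, ending in "/"

    RawPath(String raw, URI base) {
        this.raw = raw;
        this.base = base;
    }

    // Returns the string used to instantiate this path, unmodified.
    String getRawUri() {
        return raw;
    }

    // Derives the fully qualified form; the raw string stays untouched.
    // Note: raw strings that are not legal URI text (e.g. with spaces)
    // would need escaping first; this sketch does not handle that.
    URI qualified() {
        return base.resolve(raw).normalize();
    }

    // Analogue of a (parent, child) constructor: tack on one component.
    RawPath child(String name) {
        String sep = raw.endsWith("/") ? "" : "/";
        return new RawPath(raw + sep + name, base);
    }

    // Analogue of getParent(): remove the last component of the raw form.
    // Returns null when the raw form has a single component.
    RawPath parent() {
        int idx = raw.lastIndexOf('/');
        if (idx < 0) {
            return null;
        }
        return new RawPath(idx == 0 ? "/" : raw.substring(0, idx), base);
    }

    public static void main(String[] args) {
        RawPath p = new RawPath("../shared/data",
                URI.create("hdfs://namenode:8020/user/charles/"));
        System.out.println(p.getRawUri());
        System.out.println(p.qualified());
        System.out.println(p.child("part-0").getRawUri());
    }
}
```

Deriving the qualified form from the raw one, rather than storing both, is just one way to keep the two views consistent; a real change to `Path` would have to fit its existing constructors and compatibility constraints.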