MapReduce >> mail # user >> What is the difference between URI, Home Directory, Working Directory in FileSystem.java or HDFS


Ling Kun 2013-04-11, 10:33
Daryn Sharp 2013-04-11, 14:53
Re: What is the difference between URI, Home Directory, Working Directory in FileSystem.java or HDFS
Dear Daryn Sharp,
   Your reply helps me a lot in reading the HDFS and FileSystem interface
code.

   Thanks.

yours,
Ling Kun
On Thu, Apr 11, 2013 at 10:53 PM, Daryn Sharp <[EMAIL PROTECTED]> wrote:

> On Apr 11, 2013, at 5:33 AM, Ling Kun wrote:
>
> > Dear all,
> >    I am a little confused about the URI, home directory, and working
> > directory in FileSystem.java or HDFS.
> >
> >    I have listed my understanding of these concepts; can someone please
> > point out whether I am correct?  Thanks.
> >
> >    The home directory: This is usually a directory for a specific Hadoop
> > user, so its path is user specific. In HDFS, it looks like
> > hdfs://NameNode:port/user/USERNAME.
>
> Correct.
>
> >    The URI: Is this the root of the distributed filesystem? For HDFS, it
> > is just hdfs://NameNode:port/, and each file or directory in the
> > distributed filesystem is a file or subdirectory under this path.
>
> Generally correct.  However, I'd strongly suggest avoiding the use of URIs
> directly.  It's better to obtain your filesystems via
> path.getFileSystem(conf) - it will extract the URI for the filesystem
> automatically.  See below for the correct definition of a Path.
>
> >    The working directory: I am a little confused about this variable. At
> > a given time, there exists only one instance of the filesystem class, and
> > the working dir is private state of the FS. While a job runs, Hadoop
> > switches among several dirs, such as the shared system dir, the home dir,
> > or the input/output dir, and the working dir is modified each time it
> > switches.
>
> Correct.
>
> >    Although I have looked through the related documents, I am still a
> > little confused about the java.net.URI, java.io.File, and
> > org.apache.hadoop.fs.Path classes. It seems a URI could be
> > hdfs://XXX/XXX/FILENAME, while a Path can only be the path without the
> > scheme, hostname, and port.  The File class is just an object for a
> > specific file.
>
> Your understanding of Path is incorrect.  Path is really just a veneer
> over a URI.  A Path can be qualified with a scheme/authority, or just be
> absolute or relative.  If a Path is not scheme qualified, it uses the
> defaultFS.  If the Path is not absolute, it's qualified against the working
> directory.  Path provides some niceties like not requiring percent encoding
> in the path portion of the URI, and allows use of glob chars and the
> quoting thereof.
>
> I hope this helps!
>
> Daryn
--
http://www.lingcc.com
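
The qualification rules described in the reply above can be sketched with
plain java.net.URI. This is only an illustration under stated assumptions,
not Hadoop's Path implementation: the class name PathQualifySketch, the
method qualify, and the addresses hdfs://namenode:8020 and hdfs://other:9000
are hypothetical, and ".." segments and glob characters are not handled.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Illustration only: mimics the qualification rules with plain java.net.URI. */
final class PathQualifySketch {

    /**
     * Qualifies a path string: a relative path is resolved against the working
     * directory, and a path without a scheme takes the scheme and authority of
     * the default filesystem.
     */
    static URI qualify(String path, URI defaultFs, String workingDir) {
        try {
            int schemeEnd = path.indexOf("://");
            if (schemeEnd > 0) {
                // Already scheme qualified: keep its own scheme and authority.
                int authorityStart = schemeEnd + 3;
                int pathStart = path.indexOf('/', authorityStart);
                String authority = pathStart < 0
                        ? path.substring(authorityStart)
                        : path.substring(authorityStart, pathStart);
                String rest = pathStart < 0 ? "/" : path.substring(pathStart);
                return new URI(path.substring(0, schemeEnd), authority, rest, null, null);
            }
            String absolute = path;
            if (!path.startsWith("/")) {
                // Not absolute: resolve against the working directory.
                String base = workingDir.endsWith("/") ? workingDir : workingDir + "/";
                absolute = base + path;
            }
            // Not scheme qualified: borrow scheme and authority from the default
            // filesystem. The multi-argument URI constructor percent-encodes
            // illegal characters (such as spaces) in the path portion.
            return new URI(defaultFs.getScheme(), defaultFs.getAuthority(),
                    absolute, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot qualify path: " + path, e);
        }
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("hdfs://namenode:8020/"); // hypothetical NameNode
        String workingDir = "/user/alice";                   // hypothetical working dir
        System.out.println(qualify("/tmp/in.txt", defaultFs, workingDir));
        System.out.println(qualify("logs/day 1.txt", defaultFs, workingDir));
        System.out.println(qualify("hdfs://other:9000/tmp/x", defaultFs, workingDir));
    }
}
```

In this sketch the relative path "logs/day 1.txt" is written without percent
encoding by the caller; the encoded form appears only in the resulting URI
string, while getPath() on the result still returns the decoded path.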
Jay Vyas 2014-01-13, 19:22