HDFS, mail # dev - DistributedFileSystem.listStatus()  - Why does it do partial listings then assemble?


Re: DistributedFileSystem.listStatus() - Why does it do partial listings then assemble?
Todd Lipcon 2013-05-02, 16:28
Hi Brad,

The reasoning is that the NameNode's locking is fairly coarse-grained. In
older versions of Hadoop, before it worked this way, we found that listing
large directories (e.g. with 100k+ files) could hold the NameNode's lock
for quite a long time and starve other clients.

Additionally, I believe there is a second API that does the "on-demand"
fetching of the next set of files from the listing as well, no?
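
A minimal sketch of what such "on-demand" fetching looks like, in plain Java
with no Hadoop dependency. Everything here is hypothetical: FakeNameNode,
listBatch(), and LazyListing are illustration-only names, not HDFS classes.
(Hadoop 2.x and later do expose RemoteIterator-based listings, for example
FileSystem.listLocatedStatus(Path); this sketch only imitates the idea and
does not reproduce that API.) Each batch is requested only when the caller
has consumed the previous one, so a caller that stops early never pays for
the rest of the directory.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.TreeSet;

/** Hypothetical stand-in for the NameNode: serves a sorted directory in batches. */
class FakeNameNode {
    private final TreeSet<String> entries = new TreeSet<>();
    private final int batchSize;
    int rpcCount = 0; // number of batch requests served so far

    FakeNameNode(int batchSize) { this.batchSize = batchSize; }

    void add(String name) { entries.add(name); }

    /** Returns up to batchSize names strictly after startAfter ("" = from the start). */
    List<String> listBatch(String startAfter) {
        rpcCount++;
        List<String> batch = new ArrayList<>();
        for (String name : entries.tailSet(startAfter, false)) {
            if (batch.size() == batchSize) break;
            batch.add(name);
        }
        return batch;
    }
}

/** Iterator that asks for the next batch only when the current one is used up. */
class LazyListing implements Iterator<String> {
    private final FakeNameNode nn;
    private List<String> batch = new ArrayList<>();
    private int pos = 0;
    private String lastName = ""; // cursor: last name returned by the server
    private boolean exhausted = false;

    LazyListing(FakeNameNode nn) { this.nn = nn; }

    @Override
    public boolean hasNext() {
        if (pos < batch.size()) return true;
        if (exhausted) return false;
        batch = nn.listBatch(lastName); // one short request per batch
        pos = 0;
        if (batch.isEmpty()) { exhausted = true; return false; }
        lastName = batch.get(batch.size() - 1);
        return true;
    }

    @Override
    public String next() {
        if (!hasNext()) throw new NoSuchElementException();
        return batch.get(pos++);
    }
}

class LazyListingDemo {
    public static void main(String[] args) {
        FakeNameNode nn = new FakeNameNode(2);
        for (String n : new String[] {"a", "b", "c", "d", "e"}) nn.add(n);
        LazyListing it = new LazyListing(nn);
        // Consume only the first three entries, then stop.
        for (int i = 0; i < 3; i++) System.out.println(it.next());
        System.out.println("batch requests: " + nn.rpcCount); // prints "batch requests: 2"
    }
}
```

For simplicity the sketch detects the end of the directory with one extra,
empty batch request; the listStatus() code quoted below avoids that by
checking hasMore() on each partial listing.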

As for the consistency argument, you're correct that you may have a
non-atomic view of the directory contents, but I can't think of any
applications where this would be problematic.
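
To make the non-atomic view concrete, here is a small self-contained sketch
(plain Java; listBatch() and the in-memory TreeSet "directory" are
hypothetical stand-ins, not HDFS code). It assumes, as in the quoted
listStatus() source, that each batch resumes after the last name already
returned. A file created before that cursor is missed, and a file deleted
after it disappears, so the assembled listing matches neither the directory
as it was before the changes nor as it is after them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class NonAtomicListingSketch {
    /** Hypothetical helper: up to batchSize names strictly after startAfter, sorted. */
    static List<String> listBatch(TreeSet<String> dir, String startAfter, int batchSize) {
        List<String> batch = new ArrayList<>();
        for (String name : dir.tailSet(startAfter, false)) {
            if (batch.size() == batchSize) break;
            batch.add(name);
        }
        return batch;
    }

    /** Assembles a listing from two batches while the directory changes in between. */
    static List<String> listWithConcurrentChange() {
        TreeSet<String> dir = new TreeSet<>(Arrays.asList("b", "c", "d", "e"));
        List<String> listing = new ArrayList<>(listBatch(dir, "", 2)); // sees b, c

        // Simulated concurrent clients act between the two batch requests.
        dir.add("a");    // sorts before the cursor "c", so it is never seen
        dir.remove("d"); // sorts after the cursor, so it is already gone

        listing.addAll(listBatch(dir, "c", 2)); // sees only e
        return listing;
    }

    public static void main(String[] args) {
        // Before the changes: [b, c, d, e]; after them: [a, b, c, e].
        System.out.println(listWithConcurrentChange()); // prints [b, c, e]
    }
}
```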

-Todd

On Thu, May 2, 2013 at 9:18 AM, Brad Childs <[EMAIL PROTECTED]> wrote:

> Could someone explain why the DistributedFileSystem's listStatus() method
> does a piecemeal assembly of a directory listing within the method?
>
> Is there a locking issue? What if an element is added to the directory
> during the operation?  What if elements are removed?
>
> It would make sense to me that the FileSystem class listStatus() method
> returned an Iterator allowing only partial fetching/chatter as needed.  But
> I don't understand why you'd want to assemble a giant array of the listing
> chunk by chunk.
>
>
> Here's the source of the listStatus() method, and I've linked the entire
> class below.
>
>
> ---------------------------------
>
>   public FileStatus[] listStatus(Path p) throws IOException {
>     String src = getPathName(p);
>
>     // fetch the first batch of entries in the directory
>     DirectoryListing thisListing = dfs.listPaths(
>         src, HdfsFileStatus.EMPTY_NAME);
>
>     if (thisListing == null) { // the directory does not exist
>       return null;
>     }
>
>     HdfsFileStatus[] partialListing = thisListing.getPartialListing();
>     if (!thisListing.hasMore()) { // got all entries of the directory
>       FileStatus[] stats = new FileStatus[partialListing.length];
>       for (int i = 0; i < partialListing.length; i++) {
>         stats[i] = makeQualified(partialListing[i], p);
>       }
>       statistics.incrementReadOps(1);
>       return stats;
>     }
>
>     // The directory size is too big that it needs to fetch more
>     // estimate the total number of entries in the directory
>     int totalNumEntries =
>       partialListing.length + thisListing.getRemainingEntries();
>     ArrayList<FileStatus> listing =
>       new ArrayList<FileStatus>(totalNumEntries);
>     // add the first batch of entries to the array list
>     for (HdfsFileStatus fileStatus : partialListing) {
>       listing.add(makeQualified(fileStatus, p));
>     }
>     statistics.incrementLargeReadOps(1);
>
>     // now fetch more entries
>     do {
>       thisListing = dfs.listPaths(src, thisListing.getLastName());
>
>       if (thisListing == null) {
>         return null; // the directory is deleted
>       }
>
>       partialListing = thisListing.getPartialListing();
>       for (HdfsFileStatus fileStatus : partialListing) {
>         listing.add(makeQualified(fileStatus, p));
>       }
>       statistics.incrementLargeReadOps(1);
>     } while (thisListing.hasMore());
>
>     return listing.toArray(new FileStatus[listing.size()]);
>   }
>
> --------------------------------------------
>
>
>
>
>
> Ref:
>
> https://svn.apache.org/repos/asf/hadoop/common/tags/release-1.0.4/src/hdfs/org/apache/hadoop/hdfs/DistributedFileSystem.java
> http://docs.oracle.com/javase/6/docs/api/java/util/Iterator.html
>
>
> thanks!
>
> -bc
>

--
Todd Lipcon
Software Engineer, Cloudera