Hadoop, mail # dev - Hadoop FileSystem Validation Workshop/Meetup - Red Hat in Mountain View on June 25th


Stephen Watt 2013-06-17, 22:50
Hi Folks

For those of you interested, the day before the Hadoop Summit we have a face-to-face workshop/meetup on Hadoop FileSystem Validation at Red Hat in Mountain View on June 25th from 10am to 3pm (lunch provided).

I need to make sure you all get visitor passes and that we don't exceed the room capacity, so please sign up here: http://hadoop-fs.eventbrite.com/

Regards
Steve Watt

----- Original Message -----
From: "Andrew Wang" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Cc: "Milind Bhandarkar" <[EMAIL PROTECTED]>, "shv hadoop" <[EMAIL PROTECTED]>, "Steve Loughran" <[EMAIL PROTECTED]>, "Kun Ling" <[EMAIL PROTECTED]>, "Roman Shaposhnik" <[EMAIL PROTECTED]>, "Andrew Purtell" <[EMAIL PROTECTED]>, [EMAIL PROTECTED], [EMAIL PROTECTED], "Sanjay Radia" <[EMAIL PROTECTED]>
Sent: Friday, June 14, 2013 1:32:38 PM
Subject: Re: [DISCUSS] Ensuring Consistent Behavior for Alternative Hadoop FileSystems + Workshop

Hey Steve,

I agree that it's confusing. FileSystem and FileContext are essentially two
parallel sets of interfaces for accessing filesystems in Hadoop.
FileContext splits the interface and shared code with AbstractFileSystem,
while FileSystem is all-in-one. If you're looking for the AFS equivalents
to DistributedFileSystem and LocalFileSystem, see Hdfs and LocalFs.
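
To make the split concrete, here is a minimal sketch of the two shapes. It is a toy model, not Hadoop code: every class name below (ToyFileSystem, ToyAbstractFileSystem, ToyFileContext and the in-memory subclasses) is hypothetical, and paths are plain strings. It also models the mkdir difference that comes up further down the thread, on the assumption that FileSystem.mkdirs(Path) creates missing parents while FileContext.mkdir(Path, FsPermission, boolean createParent) takes an explicit flag.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only: none of these classes exist in Hadoop.

// All-in-one style: clients call this class and implementors extend it.
abstract class ToyFileSystem {
    // Client-facing call; like "mkdir -p", missing parents are created.
    public boolean mkdirs(String path) {
        String current = "";
        for (String part : path.split("/")) {
            if (part.isEmpty()) {
                continue; // skip the empty segment before the leading slash
            }
            current = current + "/" + part;
            createDir(current);
        }
        return true;
    }

    public abstract boolean exists(String path);

    // Implementation hook that lives in the same class as the client API.
    protected abstract void createDir(String path);
}

class ToyInMemoryFileSystem extends ToyFileSystem {
    private final Set<String> dirs = new HashSet<>();

    @Override
    public boolean exists(String path) {
        return path.equals("/") || dirs.contains(path);
    }

    @Override
    protected void createDir(String path) {
        dirs.add(path);
    }
}

// Split style, implementor half: clients never call this directly.
abstract class ToyAbstractFileSystem {
    abstract void createDir(String path);

    abstract boolean exists(String path);
}

class ToyInMemoryFs extends ToyAbstractFileSystem {
    private final Set<String> dirs = new HashSet<>();

    @Override
    void createDir(String path) {
        dirs.add(path);
    }

    @Override
    boolean exists(String path) {
        return path.equals("/") || dirs.contains(path);
    }
}

// Split style, client half: wraps an implementation and owns the shared logic.
final class ToyFileContext {
    private final ToyAbstractFileSystem fs;

    ToyFileContext(ToyAbstractFileSystem fs) {
        this.fs = fs;
    }

    // The caller must say whether missing parents may be created.
    // The toy throws an unchecked exception to keep the example short.
    void mkdir(String path, boolean createParent) {
        String parent = parentOf(path);
        if (!fs.exists(parent)) {
            if (!createParent) {
                throw new IllegalStateException("Parent does not exist: " + parent);
            }
            mkdir(parent, true);
        }
        fs.createDir(path);
    }

    boolean exists(String path) {
        return fs.exists(path);
    }

    private static String parentOf(String path) {
        int idx = path.lastIndexOf('/');
        return idx <= 0 ? "/" : path.substring(0, idx);
    }
}
```

With these toy classes, new ToyInMemoryFileSystem().mkdirs("/a/b") creates both /a and /a/b, whereas new ToyFileContext(new ToyInMemoryFs()).mkdir("/a/b", false) fails until /a exists or createParent is true. A test written against one style therefore does not automatically exercise the other.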

Realistically, FileSystem isn't going to be deprecated and removed any time
soon. There are lots of 3rd-party FileSystem implementations, and most apps
today use FileSystem (including many HDFS internals, like trash and the
shell).

When I read the wiki page, I figured that the mention of AFS was
essentially a typo, since everyone's been steaming ahead with FileSystem.
Standardizing FileSystem makes total sense to me; I just wanted to confirm
that plan.

Best,
Andrew

On Fri, Jun 14, 2013 at 9:38 AM, Stephen Watt <[EMAIL PROTECTED]> wrote:

> This is a good point, Andrew. The hangout was actually the first time I'd
> heard about the AbstractFileSystem class. I've been doing some further
> analysis of the Hadoop 2.0 source, and the Hadoop 2.0 implementations of
> the DistributedFileSystem and LocalFileSystem classes extend FileSystem,
> not AbstractFileSystem. I would imagine that if the plan for Hadoop 2.0 is
> to build FileSystem implementations using AbstractFileSystem, then those
> two would use it, so I'm a bit confused.
>
> Perhaps I'm looking in the wrong place? Sanjay (or anyone else), could you
> clarify this for us?
>
> Regards
> Steve Watt
>
> ----- Original Message -----
> From: "Andrew Wang" <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED], "shv hadoop" <[EMAIL PROTECTED]>,
> [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
> [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED],
> [EMAIL PROTECTED]
> Sent: Monday, June 10, 2013 5:14:16 PM
> Subject: Re: [DISCUSS] Ensuring Consistent Behavior for Alternative Hadoop
> FileSystems + Workshop
>
> Thanks for the summary, Steve; very useful.
>
> I'm wondering a bit about the point on testing AbstractFileSystem rather
> than FileSystem. While these are both wrappers for DFSClient, they're
> pretty different in terms of the APIs they expose. Furthermore, AFS is not
> actually a client-facing API; clients interact with an AFS through
> FileContext.
>
> I ask because I did some work trying to unify the symlink tests for both
> FileContext and FileSystem (HADOOP-9370 and HADOOP-9355). Subtle things
> like the default mkdir semantics are different; you can see some of the
> contortions in HADOOP-9370. I ultimately ended up just adhering to the
> FileContext-style behavior, but as a result I'm not really testing some
> parts of FileSystem.
>
> Are we going to end up with two different sets of validation tests? Or just
> choose one API over the other? FileSystem is supposed to eventually be
> deprecated in favor of FileContext (HADOOP-6446, filed in 2009), but actual