Steve Loughran 2013-02-11, 21:20
Eli Collins 2013-02-11, 21:36
Steve Loughran 2013-02-12, 08:55
Eli Collins 2013-02-12, 21:35
Steve Loughran 2013-02-12, 21:51
Eli Collins 2013-02-12, 22:09
Steve Loughran 2013-02-13, 09:44
Re: where do side-projects go in trunk now that contrib/ is gone?
I like the idea of testing all FS for expected behavior. In HttpFS we are
already doing something along these lines, testing HttpFS against HDFS and
LocalFS, and also testing 2 WebHDFS clients.
Regarding where these 'extensions' would go, we could have something
like share/hadoop/common/filesystem-ext/s3, and whoever wants to use s3
would have to symlink those JARs into common/lib. Or we could have a
HADOOP_COMMON_FS_EXT env variable that selects which extension JARs to
pick up. I guess the BigTop guys could help define this magic.
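
To make the second option concrete, here is a minimal sketch of how a launcher might turn HADOOP_COMMON_FS_EXT into a list of JARs. Everything in it is an assumption for illustration: the FsExtClasspath class is hypothetical, the comma-separated format of the variable is a guess, and the filesystem-ext root is just the path suggested above. Nothing like this exists in Hadoop today.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hadoop: resolves enabled FS extensions to JAR files.
final class FsExtClasspath {

    /** Splits a comma-separated value such as "s3,swift" into extension names (assumed format). */
    static List<String> parseExtensions(String envValue) {
        List<String> names = new ArrayList<>();
        if (envValue == null) {
            return names; // variable unset: no extensions enabled
        }
        for (String part : envValue.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    /** Collects the *.jar files under extRoot/<name> for every enabled extension. */
    static List<File> extensionJars(File extRoot, List<String> enabled) {
        List<File> jars = new ArrayList<>();
        for (String name : enabled) {
            File[] found = new File(extRoot, name)
                    .listFiles((dir, file) -> file.endsWith(".jar"));
            if (found != null) {        // null when the extension directory is missing
                Arrays.sort(found);     // keep the classpath order deterministic
                jars.addAll(Arrays.asList(found));
            }
        }
        return jars;
    }

    public static void main(String[] args) {
        // Assumed layout: share/hadoop/common/filesystem-ext/<extension>/*.jar
        File extRoot = new File("share/hadoop/common/filesystem-ext");
        List<String> enabled = parseExtensions(System.getenv("HADOOP_COMMON_FS_EXT"));
        for (File jar : extensionJars(extRoot, enabled)) {
            System.out.println(jar.getPath());
        }
    }
}
```

With HADOOP_COMMON_FS_EXT=s3 set, this would print whatever JARs sit under filesystem-ext/s3, and a wrapper script could append them to the classpath instead of symlinking.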
On Wed, Feb 13, 2013 at 1:44 AM, Steve Loughran <[EMAIL PROTECTED]> wrote:
> On 12 February 2013 22:09, Eli Collins <[EMAIL PROTECTED]> wrote:
> > I agree that the current place isn't a good one, for both the reasons
> > you mention on the jira (and because the people maintaining this code
> > don't primarily work on Hadoop). IMO the SwiftFS driver should live in
> > the swift source tree (as part of open stack).
> If they could be persuaded to move beyond .py, it'd be tempting -because
> the FileSystem API is nominally stable.
> However, one thing I have noticed during this work is how the behaviour of
> FileSystem is underspecified -that's not an issue for HDFS, which gets
> stressed rigorously during the hdfs and mapred test runs, but it does
> matter for the rest.
> There are a lot of assumptions ("files != directories", "mv / anything"
> fails) and things that aren't tested: "mv self self" returns true if self
> is a file and false if it is a directory; what exception to raise if
> readFully goes past the end of a file (and the answer is?).
> We even make an implicit assumption that file operations are consistent:
> you get back what you wrote, which turns out to be an assumption not
> guaranteed by any of the blobstores in all circumstances.
> HADOOP-9258, HADOOP-9119 tighten the spec a bit, but if you look at what
> I've been doing for Swift testing, I've created a set of test suites, one
> per operation "ls", "read", "rename", with tests for scale, directory depth
> and width on my todo list:
> Then I want to extract those into tests that can be applied to all
> filesystems (say in o.a.h.fs.contract), with some per-FS metadata file
> providing details on what the FS supports (rename, append, case
> sensitivity, MAX_PATH, ...), so that we've got better test coverage that
> can be applied to all the filesystems in the hadoop codebase. (Being
> JUnit 4, you can skip tests in-code by throwing
> AssumptionViolatedExceptions; these get reported as skips.)
> It's this expanded test coverage that will be the tightest coupling to
> > I'm not -1 on it living in-tree, it's just not my 1st choice. If you
> > want to create a top-level directory for 3rd party (read non-local,
> > non-hdfs file systems) file systems - go for it. It would be an
> > improvement on the current situation (o.a.h.fs.ftp also brings in
> > dependencies that most people don't need). I don't think we need to
> > come up with a new top-level "kitchen sink" directory to handle all
> > Hadoop extensions, there are a few well-defined extension points that
> > can likely be handled independently so logically grouping them
> > separately makes sense to me (and perhaps we'll decide some extensions
> > are better in-tree and some not).
> Makes sense. That I will do in a JIRA
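
As a rough illustration of the per-FS metadata idea quoted above, the sketch below loads a set of "supports.*" flags and reports a check as skipped when the filesystem does not claim the feature. All names here (FsContract, SkippedException, the property keys) are hypothetical. SkippedException is only a stand-in for JUnit 4's AssumptionViolatedException so that the example runs without a test library; in a real suite the assumption would go through JUnit itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Sketch only: FsContract, SkippedException and the "supports.*" keys are hypothetical.
final class FsContract {

    /** Stand-in for JUnit 4's AssumptionViolatedException. */
    static final class SkippedException extends RuntimeException {
        SkippedException(String message) {
            super(message);
        }
    }

    /** One contract check; it can call assumeSupported() before touching the filesystem. */
    interface Check {
        void run(FsContract contract) throws Exception;
    }

    private final Properties options;

    FsContract(Properties options) {
        this.options = options;
    }

    /** True only if the per-FS metadata explicitly says the feature is supported. */
    boolean supports(String feature) {
        return Boolean.parseBoolean(options.getProperty("supports." + feature, "false"));
    }

    /** Aborts the current check as a skip when the feature is not supported. */
    void assumeSupported(String feature) {
        if (!supports(feature)) {
            throw new SkippedException("filesystem does not declare support for " + feature);
        }
    }

    /** Runs every check and records PASSED, SKIPPED or FAILED against its name. */
    Map<String, String> runAll(Map<String, Check> checks) {
        Map<String, String> results = new LinkedHashMap<>();
        for (Map.Entry<String, Check> entry : checks.entrySet()) {
            try {
                entry.getValue().run(this);
                results.put(entry.getKey(), "PASSED");
            } catch (SkippedException e) {
                results.put(entry.getKey(), "SKIPPED");
            } catch (Exception | AssertionError e) {
                results.put(entry.getKey(), "FAILED");
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // Metadata for an imaginary blobstore: rename works, append does not.
        Properties blobstore = new Properties();
        blobstore.setProperty("supports.rename", "true");
        blobstore.setProperty("supports.append", "false");

        Map<String, Check> checks = new LinkedHashMap<>();
        checks.put("rename", c -> c.assumeSupported("rename"));
        checks.put("append", c -> c.assumeSupported("append"));

        System.out.println(new FsContract(blobstore).runAll(checks));
    }
}
```

The point of the design is that the same checks run against every filesystem, and only the metadata file changes per implementation.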
Steve Loughran 2013-02-14, 14:05
Eric Baldeschwieler 2013-03-01, 05:02
Steve Loughran 2013-03-08, 14:43
Alejandro Abdelnur 2013-03-08, 16:15
Steve Loughran 2013-03-08, 16:57
Alejandro Abdelnur 2013-03-08, 17:07
Alejandro Abdelnur 2013-03-08, 18:47
Steve Loughran 2013-03-09, 11:36
Alejandro Abdelnur 2013-03-11, 19:15