HDFS, mail # dev - Heads up - Snapshots feature merge into trunk

Suresh Srinivas 2013-04-13, 01:05
Aaron T. Myers 2013-04-18, 01:45
Tsz Wo Sze 2013-04-18, 18:48
Aaron T. Myers 2013-04-18, 20:49
Tsz Wo Sze 2013-04-18, 21:53
Aaron T. Myers 2013-04-18, 22:06
Todd Lipcon 2013-04-24, 20:25
Re: Heads up - Snapshots feature merge into trunk
Suresh Srinivas 2013-04-24, 21:00
I think we should take this to the jira rather than the merge heads-up thread.
Nicholas, please suggest a jira where we can continue the discussion.

Some comments inline:
On Wed, Apr 24, 2013 at 1:25 PM, Todd Lipcon <[EMAIL PROTECTED]> wrote:

> On Fri, Apr 19, 2013 at 3:36 AM, Aaron T. Myers <[EMAIL PROTECTED]> wrote:
> > On Fri, Apr 19, 2013 at 6:53 AM, Tsz Wo Sze <[EMAIL PROTECTED]> wrote:
> >
> > > HdfsAdmin is also for admin operations.  However, createSnapshot etc
> > > methods aren't.
> > >
> >
> > I agree that they're not administrative operations in the sense that they
> > don't strictly require super user privilege, but they are "administrative"
> > in the sense that they will most-often be used by those administering
> > The HdfsAdmin class should not be construed to contain only operations
> > which require super user privilege, even though that happens to be the case
> > right now. It's intended as just a public API for HDFS-specific operations.
> >

I have to disagree about adding this functionality to HdfsAdmin. The HdfsAdmin
class is for admin operations. As Nicholas has said, the snapshot operations
are no different from operations such as mkdir or creating a file.
> > Regardless, my point is not necessarily that these operations should go
> > into the HdfsAdmin class, but rather that they shouldn't go into the
> > FileSystem class, since the snapshots API doesn't seem to me like it will
> > generalize to other FileSystem implementations.
> >
> >
> Agreed. The cases of WAFL/ZFS were brought up -- in those file systems,
> even if users may take snapshots, they're done using FS-specific APIs
> rather than any standard Linux interface. So, I'm in favor of either
> putting the APIs in HdfsAdmin, or alternatively in DistributedFileSystem,
> forcing a user to down-cast if they want to use the HDFS-specific
> operation.
I have a hard time understanding the issue with adding these methods to the
FileSystem API. I think we already have many operations that, one might
argue, do not belong in a generic file system, such as getting block size,
file checksum, operations to copy from local or copy to local, getting
replication, etc. These are operations that are largely influenced by having
HDFS as the dominant implementation.

I also think there are other operations that are only in
should be moved down to FileSystem, such as concat. I think it is
okay for the base FileSystem to throw an unsupported exception for such
operations. The current way of casting a FileSystem to a non-public
DistributedFileSystem is not a good idea.
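
To make the two options concrete, here is a minimal sketch of the "base class throws unsupported" pattern. The classes below are hypothetical stand-ins, not the real Hadoop FileSystem or DistributedFileSystem, and the returned snapshot path is only an illustrative convention.

```java
// Hypothetical stand-ins for illustration only; these are NOT the real Hadoop classes.
abstract class BaseFileSystem {
    // Default behavior: a generic file system does not support snapshots.
    String createSnapshot(String dir, String name) {
        throw new UnsupportedOperationException(
            getClass().getSimpleName() + " does not support snapshots");
    }
}

// A file system that inherits the default and therefore rejects the call.
class LocalLikeFileSystem extends BaseFileSystem { }

// A file system that overrides the default and returns the snapshot path.
class HdfsLikeFileSystem extends BaseFileSystem {
    @Override
    String createSnapshot(String dir, String name) {
        // Assumed path convention, chosen for this sketch.
        return dir + "/.snapshot/" + name;
    }
}

class SnapshotDemo {
    // Callers work against the base type; no down-cast is needed.
    static String trySnapshot(BaseFileSystem fs, String dir, String name) {
        try {
            return fs.createSnapshot(dir, name);
        } catch (UnsupportedOperationException e) {
            return "unsupported";
        }
    }

    public static void main(String[] args) {
        System.out.println(trySnapshot(new HdfsLikeFileSystem(), "/data", "s1"));
        System.out.println(trySnapshot(new LocalLikeFileSystem(), "/data", "s1"));
    }
}
```

With this shape, client code never needs to know the concrete class. The alternative discussed above would instead require an instanceof check and a cast to the HDFS-specific type before calling the method.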

Other file systems that support snapshots could implement these methods.
Implementing these methods does not mean they have to use the same
snapshot path convention. They can document and provide their own conventions
for supporting snapshot paths.
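
As a rough illustration of that last point, the sketch below shows two implementations sharing one method signature while exposing snapshots under different path conventions. All type names and both conventions are assumptions made up for this example; none of them come from Hadoop or from any particular file system.

```java
// Hypothetical sketch; none of these types exist in Hadoop.
interface SnapshotCapable {
    // Returns the path under which the new snapshot is visible.
    String createSnapshot(String dir, String name);
}

// Convention A (assumed): snapshots appear under a hidden ".snapshot" child directory.
class DotDirFs implements SnapshotCapable {
    public String createSnapshot(String dir, String name) {
        return dir + "/.snapshot/" + name;
    }
}

// Convention B (assumed): snapshots are addressed with an "@name" suffix on the directory.
class SuffixFs implements SnapshotCapable {
    public String createSnapshot(String dir, String name) {
        return dir + "@" + name;
    }
}

class ConventionDemo {
    public static void main(String[] args) {
        SnapshotCapable[] all = { new DotDirFs(), new SuffixFs() };
        for (SnapshotCapable fs : all) {
            // Same call on each implementation; only the resulting path differs.
            System.out.println(fs.getClass().getSimpleName()
                + " -> " + fs.createSnapshot("/data", "s1"));
        }
    }
}
```

The caller uses the returned path rather than building one itself, so each implementation is free to document its own layout.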