HDFS, mail # dev - [Proposal] Pluggable Namespace

Re: [Proposal] Pluggable Namespace
Milind Bhandarkar 2013-10-08, 17:13
Thanks for all the feedback, folks.

I have created a jira: https://issues.apache.org/jira/browse/HDFS-5324.

Let us continue detailed discussions there.

- Milind
Milind Bhandarkar
Chief Scientist
+1-650-523-3858 (W)
+1-408-666-8483 (C)
On Mon, Oct 7, 2013 at 9:50 PM, sanjay Radia <[EMAIL PROTECTED]> wrote:

> On Oct 3, 2013, at 12:17 PM, Milind Bhandarkar wrote:
> > Exec Summary: For the last couple of months, we, at Pivotal, along with a
> > couple of folks in the community have been working on making Namespace
> > implementation in the namenode pluggable. We have demonstrated that it can
> > be done without major surgery on the namenode, and does not have noticeable
> > performance impact. We would like to contribute it back to Apache if there
> > is sufficient interest. Please let us know if you are interested, and we
> > will create a Jira and update the patch for in-progress work.
> > ……
> Milind,
> A reasonable idea, but it is best to discuss the actual details in a jira. Some
> initial thoughts, to clear up some of the confusion (and accusations) in
> this thread:
>
> HDFS pluggability (and its relation to the pluggability added as part of
> Federation)
>  - Pluggability and federation are orthogonal, although we did improve the
> pluggability of HDFS as part of the federation implementation. As Vinod has
> noted, the *block layer* was separated out as part of the federation work,
> which makes the general development of new HDFS namespace implementations
> easier. Federation's pluggability was targeted towards someone writing a new
> NN and reusing the block storage layer via a library, optionally living
> side-by-side with different implementations of the NN within the same
> cluster. Hence we added the notion of block pools and separated out the
> block management layer.
>  - So your proposed work is clearly not in conflict with Federation or
> even with the pluggability that Federation added; philosophically,
> your proposal is complementary.
>
> Considerations: A Public API?
> The FileSystem/AbstractFileSystem APIs and the newly proposed
> AbstractFSNamesystem target very different kinds of pluggability into
> Hadoop. The former takes a thin application API (FileSystem and
> FileContext) and makes it easy for users to plug in different filesystems
> (S3, LocalFS, etc.) as Hadoop-compatible filesystems. In contrast, the latter
> (the proposed AbstractFSNamesystem) is a fatter interface inside the depths
> of the HDFS implementation and makes parts of the implementation pluggable.
> I would not make your proposed AbstractFSNamesystem a public stable
> Hadoop API but instead direct it towards HDFS developers who want to
> extend the implementation of HDFS more easily. Were you envisioning the
> AbstractFSNamesystem to be a stable public Hadoop API? If someone has
> their own private implementation of this new abstract class, would the
> HDFS community have the freedom to modify the abstract class in
> incompatible ways? These are discussions for the Jira.
>
> A somewhat related piece of work:
> Since Milind motivated his pluggability by a new NN implementation (that
> happens to use HBase), I will briefly mention an experiment in building a
> new NN that stores only a partial namespace in memory. The goal of this
> experiment was *not* to make the NN code more pluggable, but instead to
> provide an alternate implementation of the NN; hence it is orthogonal. A
> PhD student who worked as an intern at Hortonworks implemented a NN that
> stores only a partial namespace in RAM. She presented this to a HUG in Aug
> 2013 in Sunnyvale. I have encouraged her to file a jira, but she wants to
> finish some more experiments before filing. I will file a jira on her
> behalf and refer to her work in the next day or so. It is a prototype that
> helps us understand how well the particular implementation choice for this
> alternate NN works. It would be interesting to see if her code changes fit
> into Milind's newly proposed AbstractFSNamesystem. My initial view is that