HDFS >> mail # dev >> [Proposal] Pluggable Namespace


Re: [Proposal] Pluggable Namespace
Thanks for all the feedback, folks.

I have created a jira: https://issues.apache.org/jira/browse/HDFS-5324.

Let us continue detailed discussions there.

- Milind
---
Milind Bhandarkar
Chief Scientist
Pivotal
+1-650-523-3858 (W)
+1-408-666-8483 (C)
On Mon, Oct 7, 2013 at 9:50 PM, sanjay Radia <[EMAIL PROTECTED]> wrote:

>
> On Oct 3, 2013, at 12:17 PM, Milind Bhandarkar wrote:
>
> > Exec Summary: For the last couple of months, we, at Pivotal, along with a
> > couple of folks in the community have been working on making Namespace
> > implementation in the namenode pluggable. We have demonstrated that it can
> > be done without major surgery on the namenode, and does not have noticeable
> > performance impact. We would like to contribute it back to Apache if there
> > is sufficient interest. Please let us know if you are interested, and we
> > will create a Jira and update the patch for in-progress work.
> > ……
>
>
> Milind,
> A reasonable idea, but it is best to discuss the actual details in a jira.
> Some initial thoughts, to clear up some of the confusion (and accusations)
> in this thread:
>
> HDFS pluggability (and its relation to the pluggability added as part of
> Federation)
>  - Pluggability and federation are orthogonal, although we did improve the
> pluggability of HDFS as part of the federation implementation. As Vinod has
> noted, the *block layer* was separated out as part of the federation work,
> which makes the general development of new HDFS namespace
> implementations easier. Federation's pluggability was targeted towards
> someone writing a new NN and reusing the block storage layer via a library,
> optionally living side-by-side with different implementations of the
> NN within the same cluster. Hence we added the notion of block pools and
> separated out the block management layer.
>  - So your proposed work is clearly not in conflict with Federation or
> even with the pluggability that Federation added; philosophically,
> your proposal is complementary.
>
> Considerations: A Public API?
> The FileSystem/AbstractFileSystem APIs and the newly proposed
> AbstractFSNamesystem target very different kinds of pluggability into
> Hadoop. The former takes a thin application API (FileSystem and
> FileContext) and makes it easy for users to plug in different filesystems
> (S3, LocalFS, etc.) as Hadoop-compatible filesystems. In contrast, the latter
> (the proposed AbstractFSNamesystem) is a fatter interface deep inside the
> HDFS implementation and makes parts of the implementation pluggable.
>
> I would not make your proposed AbstractFSNamesystem a public stable
> Hadoop API, but instead direct it towards HDFS developers who want to
> extend the implementation of HDFS more easily. Were you envisioning
> AbstractFSNamesystem as a stable public Hadoop API? If someone has
> their own private implementation of this new abstract class, would the
> HDFS community have the freedom to modify the abstract class in
> incompatible ways? These are discussions for the Jira.
>
> A somewhat related piece of work:
> Since Milind motivated his pluggability with a new NN implementation (that
> happens to use HBase), I will briefly mention an experiment in building a
> new NN that stores only a partial namespace in memory. The goal of this
> experiment was *not* to make the NN code more pluggable, but instead to
> provide an alternate implementation of the NN; hence it is orthogonal. A
> PhD student who worked as an intern at Hortonworks implemented a NN that
> stores only a partial namespace in RAM. She presented this to a HUG in Aug
> 2013 in Sunnyvale. I have encouraged her to file a jira, but she wants to
> finish some more experiments before filing; I will file a jira on her
> behalf and refer to her work in the next day or so. It is a prototype that
> helps us understand how well the particular implementation choice for this
> alternate NN works. It would be interesting to see if her code changes fit
> into Milind's newly proposed AbstractFSNamesystem. My initial view is that