HDFS, mail # dev - [Proposal] Pluggable Namespace


Re: [Proposal] Pluggable Namespace
Milind Bhandarkar 2013-10-07, 17:11
Chris,

CLI is an issue that we had considered, but not in depth. My thinking was
that the dfsadmin commands that are relevant only to current FSNamesystem
can be separated out later from commands that are applicable to all
Namesystem implementations, maybe with a separate nsadmin command group.
However, this will be an incompatible change. For now, the alternate
namespace implementations can throw a NotSupportedException, while keeping
the current behavior intact.
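
As a rough illustration of the fallback described above (a sketch only, not code from the patch): the interface and class names below are hypothetical, and java.lang.UnsupportedOperationException stands in for the NotSupportedException mentioned above. Commands that apply to every implementation stay abstract, while commands tied to the current fsimage + edits namespace default to throwing.

```java
// Hypothetical admin interface; these names are illustrative, not Hadoop APIs.
interface NamespaceAdmin {
    // Applicable to every namespace implementation.
    String report();

    // Only meaningful for a namespace persisted as fsimage + edits.
    // Alternate implementations inherit this default and reject the command.
    default void saveNamespace() {
        throw new UnsupportedOperationException(
            "saveNamespace is not supported by " + getClass().getSimpleName());
    }
}

// Stand-in for the current file-backed namespace: keeps existing behavior.
final class FileBackedAdmin implements NamespaceAdmin {
    private int checkpoints = 0;

    @Override
    public String report() {
        return "file-backed, checkpoints=" + checkpoints;
    }

    @Override
    public void saveNamespace() {
        checkpoints++;
    }
}

// Stand-in for an alternate namespace: does not override saveNamespace().
final class KeyValueAdmin implements NamespaceAdmin {
    @Override
    public String report() {
        return "key-value store";
    }
}

final class NamespaceAdminDemo {
    public static void main(String[] args) {
        NamespaceAdmin[] admins = { new FileBackedAdmin(), new KeyValueAdmin() };
        for (NamespaceAdmin admin : admins) {
            try {
                admin.saveNamespace();
                System.out.println(admin.report());
            } catch (UnsupportedOperationException e) {
                System.out.println("unsupported: " + e.getMessage());
            }
        }
    }
}
```

With this shape the CLI itself would not need to change for now; a later split into a separate command group would only move which interface declares each method.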

Regarding upgrade/rollback, this is work in progress. After the initial
pluggability patch is published, we will need feedback on how best to
handle switching from one NS implementation to another.
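
One possible shape for that switch, sketched under the assumption of a neutral intermediate representation as suggested in the quoted message below: each implementation provides one export and one import, so N implementations need N pairs rather than a converter per transition. All names here (InodeRecord, NamespaceStore, TreeStore, FlatStore, NamespaceMigration) are hypothetical and the record is reduced to a path and a directory flag; a real record would carry far more metadata.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical neutral record: one entry per file or directory (root excluded).
final class InodeRecord {
    final String path;
    final boolean directory;

    InodeRecord(String path, boolean directory) {
        this.path = path;
        this.directory = directory;
    }
}

// Hypothetical contract that each pluggable namespace would implement once.
interface NamespaceStore {
    List<InodeRecord> exportRecords();
    void importRecords(List<InodeRecord> records);
}

// Stand-in for a tree-shaped, in-memory namespace.
final class TreeStore implements NamespaceStore {
    private static final class Node {
        final boolean directory;
        final Map<String, Node> children = new TreeMap<>();
        Node(boolean directory) { this.directory = directory; }
    }

    private Node root = new Node(true);

    @Override
    public void importRecords(List<InodeRecord> records) {
        root = new Node(true);
        for (InodeRecord r : records) {
            Node cur = root;
            String[] parts = r.path.substring(1).split("/");
            for (int i = 0; i < parts.length; i++) {
                // Intermediate path components are always directories.
                final boolean dir = (i < parts.length - 1) || r.directory;
                cur = cur.children.computeIfAbsent(parts[i], k -> new Node(dir));
            }
        }
    }

    @Override
    public List<InodeRecord> exportRecords() {
        List<InodeRecord> out = new ArrayList<>();
        walk("", root, out);
        return out;
    }

    // Depth-first walk that emits each parent before its children.
    private static void walk(String prefix, Node node, List<InodeRecord> out) {
        for (Map.Entry<String, Node> e : node.children.entrySet()) {
            String path = prefix + "/" + e.getKey();
            out.add(new InodeRecord(path, e.getValue().directory));
            walk(path, e.getValue(), out);
        }
    }
}

// Stand-in for a flat key/value namespace keyed by full path.
final class FlatStore implements NamespaceStore {
    private final Map<String, Boolean> entries = new TreeMap<>();

    @Override
    public void importRecords(List<InodeRecord> records) {
        entries.clear();
        for (InodeRecord r : records) {
            entries.put(r.path, r.directory);
        }
    }

    @Override
    public List<InodeRecord> exportRecords() {
        List<InodeRecord> out = new ArrayList<>();
        for (Map.Entry<String, Boolean> e : entries.entrySet()) {
            out.add(new InodeRecord(e.getKey(), e.getValue()));
        }
        return out;
    }
}

final class NamespaceMigration {
    // Works for any pair of implementations; rollback swaps the arguments.
    static void migrate(NamespaceStore from, NamespaceStore to) {
        to.importRecords(from.exportRecords());
    }
}
```

Whether such a record format could capture everything the current FSNamesystem tracks is exactly the open question, so this is only meant to make the discussion concrete.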

The NS interfaces are currently private. However, we can make them
LimitedPrivate with "extensions" as the project.

Thoughts?

- Milind
---
Milind Bhandarkar
Chief Scientist
Pivotal
+1-650-523-3858 (W)
+1-408-666-8483 (C)
On Mon, Oct 7, 2013 at 9:44 AM, Chris Nauroth <[EMAIL PROTECTED]> wrote:

> Thank you for sending out these notes, Milind.
>
> > Current HDFS design is such that FSNameSystem is baked into even high
> > level interfaces, this is a major hurdle in cleanly implementing
> > pluggable name systems. We aim to propose a change in such interfaces
> > into which FSNameSystem is tightly coupled.
>
> There is also another interface for us to consider: the end-user/operator
> interface.  I see that you've made changes to the JSP pages, but I'm also
> curious about the CLI.  Many of the current "hdfs dfsadmin" commands are
> tightly coupled to our current in-memory representation backed by
> persistence to fsimage + edits, either via FileJournalManager or
> QuorumJournalManager.  It seems unavoidable that namespace administration
> must be tightly coupled to the namespace implementation, so I'm curious if
> your design also has considered pluggable namespace administration
> commands.
>
> You mentioned the upgrade path from file-based to key-value-store-based
> (and vice versa for rollback).  Does this involve refactoring the
> upgrade/rollback code so that pluggable implementations can provide their
> own upgrade implementations?  I imagine the challenge here is avoiding a
> combinatorial explosion such that every transition from one implementation
> to another is a separate code path or separate class.  A suitable
> intermediate representation would avoid this.  I'm not certain if the
> current FSNamesystem and its internal data structures are sufficient.
>
> Another point to consider is that pluggability would put a new requirement
> for backwards-compatibility on the namesystem interfaces.  Traditionally,
> we've treated this as private implementation code that we can change
> freely, as long as we also provide upgrade code to handle translation from
> a prior layout version.
>
> Chris Nauroth
> Hortonworks
> http://hortonworks.com/
>
>
>
> On Mon, Oct 7, 2013 at 8:52 AM, Bobby Evans <[EMAIL PROTECTED]> wrote:
>
> > Putting all conspiracy theories aside :).  Any way we decide to scale
> > the name node is going to have limitations.  Federation currently has
> > the problem that we cannot easily move data between different name
> > nodes.  It is a static partitioning. It is not a blocker, but it can be
> > annoying.  We can fix this, but to do so would require some
> > sophisticated coordination between the name nodes involved.  If we put
> > the namespace in a key/value store like HBase there are likely to be
> > mapping issues between a tree structure and a flat structure making
> > some use cases, like very deep trees, potentially a lot slower.  It
> > also does not scale the maximum number of operations per second a file
> > system can do.  Because each has advantages and drawbacks it is
> > important for us to enable different use cases. This will allow for
> > experimentation and parallel development and testing of new
> > namespaces.  I thought this was the original vision of federation.
> > Something where /tmp and /archive both co-exist together,