Re: [Proposal] Pluggable Namespace
Thank you for sending out these notes, Milind.

> Current HDFS design is such that FSNamesystem is baked into even high-level
> interfaces; this is a major hurdle in cleanly implementing pluggable name
> systems. We aim to propose a change to the interfaces into which
> FSNamesystem is tightly coupled.

There is also another interface for us to consider: the end-user/operator
interface.  I see that you've made changes to the JSP pages, but I'm also
curious about the CLI.  Many of the current "hdfs dfsadmin" commands are
tightly coupled to our current in-memory representation backed by
persistence to fsimage + edits, either via FileJournalManager or
QuorumJournalManager.  It seems unavoidable that namespace administration
must be tightly coupled to the namespace implementation, so I'm curious whether
your design has also considered pluggable namespace administration commands.
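
To make the question concrete, here is a minimal sketch of what pluggable
administration commands could look like.  It is an illustration only, not
part of the proposal: the NamespaceAdmin interface, both implementing
classes, the "-compactStore" command, and all returned messages are
hypothetical.  Only the command names "-saveNamespace" and "-rollEdits" are
taken from the existing dfsadmin tool.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

public class PluggableAdminSketch {

    /** Hypothetical contract: each namespace plugin advertises its own admin commands. */
    interface NamespaceAdmin {
        Map<String, Supplier<String>> commands();
    }

    /** Stand-in for the current fsimage + edits implementation. */
    static class FileBackedAdmin implements NamespaceAdmin {
        @Override
        public Map<String, Supplier<String>> commands() {
            Map<String, Supplier<String>> m = new LinkedHashMap<>();
            m.put("-saveNamespace", () -> "checkpoint written to fsimage");
            m.put("-rollEdits", () -> "edit log rolled");
            return m;
        }
    }

    /** Stand-in for a key-value-store-backed implementation. */
    static class KeyValueAdmin implements NamespaceAdmin {
        @Override
        public Map<String, Supplier<String>> commands() {
            Map<String, Supplier<String>> m = new LinkedHashMap<>();
            // "-compactStore" is an invented command name, used only for illustration.
            m.put("-compactStore", () -> "store compacted");
            return m;
        }
    }

    /** The CLI front end dispatches by name and reports commands the plugin lacks. */
    static String run(NamespaceAdmin admin, String command) {
        Supplier<String> action = admin.commands().get(command);
        if (action == null) {
            return command + ": not supported by this namespace implementation";
        }
        return action.get();
    }

    public static void main(String[] args) {
        System.out.println(run(new FileBackedAdmin(), "-rollEdits"));
        System.out.println(run(new KeyValueAdmin(), "-rollEdits"));
    }
}
```

In this sketch the CLI stays generic, and a command that only makes sense
for fsimage + edits is simply absent from a plugin that has no edit log.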

You mentioned the upgrade path from file-based to key-value-store-based
(and vice versa for rollback).  Does this involve refactoring the
upgrade/rollback code so that pluggable implementations can provide their
own upgrade implementations?  I imagine the challenge here is avoiding a
combinatorial explosion such that every transition from one implementation
to another is a separate code path or separate class.  A suitable
intermediate representation would avoid this.  I'm not certain whether the
current FSNamesystem and its internal data structures are sufficient to
serve as that representation.
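
As a rough sketch of the intermediate-representation idea: if every
implementation only knows how to export to and import from one neutral
form, then N implementations need N export/import pairs rather than a
converter for each of the N x (N - 1) transitions.  Everything below is
hypothetical and heavily simplified (the NamespaceStore interface, both
store classes, and a neutral form that is just a map from path to an
is-directory flag); it is not how HDFS upgrade code works today.

```java
import java.util.Map;
import java.util.TreeMap;

public class NamespaceMigrationSketch {

    /** Hypothetical plugin contract. The neutral form maps path -> isDirectory. */
    interface NamespaceStore {
        Map<String, Boolean> exportNeutral();
        void importNeutral(Map<String, Boolean> neutral);
    }

    /** Stand-in for the in-memory, file-backed namespace. */
    static class InMemoryStore implements NamespaceStore {
        final Map<String, Boolean> entries = new TreeMap<>();

        @Override
        public Map<String, Boolean> exportNeutral() {
            return new TreeMap<>(entries);
        }

        @Override
        public void importNeutral(Map<String, Boolean> neutral) {
            entries.clear();
            entries.putAll(neutral);
        }
    }

    /** Stand-in for a key-value-store-backed namespace: path -> "d" or "f". */
    static class KeyValueStore implements NamespaceStore {
        final Map<String, String> table = new TreeMap<>();

        @Override
        public Map<String, Boolean> exportNeutral() {
            Map<String, Boolean> out = new TreeMap<>();
            table.forEach((path, type) -> out.put(path, type.equals("d")));
            return out;
        }

        @Override
        public void importNeutral(Map<String, Boolean> neutral) {
            table.clear();
            neutral.forEach((path, isDir) -> table.put(path, isDir ? "d" : "f"));
        }
    }

    /** One generic routine covers every source/target pair, in both directions. */
    static void migrate(NamespaceStore from, NamespaceStore to) {
        to.importNeutral(from.exportNeutral());
    }

    public static void main(String[] args) {
        InMemoryStore fileBased = new InMemoryStore();
        fileBased.entries.put("/tmp", true);
        fileBased.entries.put("/tmp/a.txt", false);

        KeyValueStore kv = new KeyValueStore();
        migrate(fileBased, kv);        // upgrade: file-based -> key-value

        InMemoryStore restored = new InMemoryStore();
        migrate(kv, restored);         // rollback: key-value -> file-based

        System.out.println(kv.table);
        System.out.println(restored.entries.equals(fileBased.entries));
    }
}
```

The open question from the paragraph above still applies: a real neutral
form would have to carry everything any implementation needs (permissions,
block lists, quotas, and so on), which this toy map does not attempt.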

Another point to consider is that pluggability would put a new requirement
for backwards-compatibility on the namesystem interfaces.  Traditionally,
we've treated this as private implementation code that we can change
freely, as long as we also provide upgrade code to handle translation from
a prior layout version.

Chris Nauroth

On Mon, Oct 7, 2013 at 8:52 AM, Bobby Evans <[EMAIL PROTECTED]> wrote:

> Putting all conspiracy theories aside :).  Any way we decide to scale the
> name node is going to have limitations.  Federation currently has the
> problem that we cannot easily move data between different name nodes.  It
> is a static partitioning.  It is not a blocker, but it can be annoying.  We
> can fix this, but to do so would require some sophisticated coordination
> between the name nodes involved.  If we put the namespace in a key/value
> store like HBase, there are likely to be mapping issues between a tree
> structure and a flat structure, making some use cases, like very deep
> trees, potentially a lot slower.  It also does not scale the maximum
> number of operations per second a file system can do.  Because each has
> advantages and drawbacks, it is important for us to enable different use
> cases.  This will allow for experimentation and parallel development and
> testing of new namespaces.  I thought this was the original vision of
> federation: something where /tmp and /archive both coexist, but
> potentially have very different implementations to optimize for different
> use cases.
>
> Vinod,
>
> Yes, block management has been separated out.  This is not about that; it
> is about providing a clean plugin point where someone can more easily take
> advantage of not just the block management code, but also the RPC and
> client code.
>
> --Bobby
>
> On 10/6/13 10:04 PM, "Mahadev Konar" <[EMAIL PROTECTED]> wrote:
> >Milind,
> > Am I missing something here? This was supposed to be a discussion, and I am
> >hoping that's why you started the thread. I don't see any conspiracy theory
> >being considered or talked about anywhere. Vinod asked
> >some questions; if you can't or do not want to respond, I suggest you skip
> >emailing or ignore them rather than making false assumptions and accusations.
> >I hope the intent here is to contribute code, and that it stays that way.
> >
> >thanks
> >mahadev
> >
> >On Oct 6, 2013, at 5:58 PM, Milind Bhandarkar <[EMAIL PROTECTED]>
> >wrote:
> >
> >> Vinod,
> >>
> >> I have received a few emails about concerns that this effort somehow
