Namespaces are mainly for NameNode scalability. If someone copies a file to
another namespace, then essentially they would be creating 6 copies of the
same file (with 3x replication, 3 replicas in each namespace).
To achieve file name redundancy, it is better to have NameNode HA instead
of copying the file to another namespace. Since DataNodes serve blocks to
multiple namespaces, locality is not an issue, and copying a file to another
namespace would not buy you much.
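
To make the replica arithmetic concrete, here is a rough sketch in plain
Python. It is only a toy model, not Hadoop code: the Cluster class, its
put() and physical_copies() methods, and the namespace names are all
made up for illustration, and it assumes the same 3x replication in
every namespace.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model of a federated cluster (hypothetical, not a Hadoop API)."""

    # Assumed replication factor, applied identically in every namespace.
    replication: int = 3
    # Maps a namespace name to the set of file paths it knows about.
    namespaces: dict = field(default_factory=dict)

    def put(self, namespace: str, path: str) -> None:
        """Record that `path` exists in `namespace`."""
        self.namespaces.setdefault(namespace, set()).add(path)

    def physical_copies(self, path: str) -> int:
        """Count block replicas stored for `path` across all namespaces."""
        holders = sum(1 for files in self.namespaces.values() if path in files)
        return holders * self.replication


cluster = Cluster()
cluster.put("ns1", "/data/events.log")
print(cluster.physical_copies("/data/events.log"))  # 3

# Copying the same file into a second namespace doubles the stored replicas.
cluster.put("ns2", "/data/events.log")
print(cluster.physical_copies("/data/events.log"))  # 6
```

The second namespace adds storage cost but no extra protection for the
blocks themselves, since the same DataNodes hold both sets of replicas.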
2013/5/15 Michael Segel <[EMAIL PROTECTED]>
> On the one hand, I'm trying to understand why one would break a cluster
> into multiple name spaces.
> (Obviously this gets back to managing very large clusters.)
> On the other, why would someone want to have a copy of a file in two
> different name spaces?
> I'm making an assumption that when we have 3x replication, the
> replicas don't cross name space boundaries. (Is this correct?)
> My take is that one would copy a file to a second name space because they
> want a physical copy in both name spaces for redundancy in case a name
> space goes down. They would do this only for mission critical files, or if
> the data is being shared by two different groups who want their own copy of
> the data and they work solely within a single name space.
> The reason I am asking is that I'm trying to see how people view and use
> Does that make sense?
> On May 15, 2013, at 9:24 AM, Lohit <[EMAIL PROTECTED]> wrote:
> > On May 15, 2013, at 7:17 AM, Michael Segel <[EMAIL PROTECTED]>
> >> Quick question...
> >> So when we have a cluster which has multiple namespaces (multiple name
> >> nodes), why would you have a file in two different namespaces?
> > Are you asking why one would create the same file in two namespaces? Or
> > are you asking whether there is an option to have only one file but in
> > two namespaces?
> > Could you rephrase or give more information?