Re: Merging Namenode Federation feature (HDFS-1052) to trunk
Konstantin Shvachko 2011-03-15, 01:12
Dhruba, good you are speaking up for federation.
I consider it important as it means more support for the feature in the
The purpose of my reply was to get this discussion going, as I found Allen's
question unanswered for 2 weeks.
The concern he has seems legitimate to me. If ops think federation will
"make running a grid much much harder" I want to know why and how much
Because cluster "manageability" is claimed as one of the objectives of
I am certainly familiar with the design, having been a part of it for a while.
All my concerns have been articulated and are well known, though not all of
them have been addressed.
The way I see it now, Federation introduces
- lots of code complexity to the system
- harder manageability, according to Allen
- potential performance degradation (tbd)
And the main question for those 95% of users, who don't run large clusters or
don't want to place all their compute resources in one data center, is: what
is the advantage in supporting it?
Performance-wise there are 2 main aspects:
- Does federation give me the same cluster performance if I don't federate?
- If I federate how much more throughput can I get?
On Mon, Mar 14, 2011 at 10:43 AM, Dhruba Borthakur <[EMAIL PROTECTED]> wrote:
> Hi folks,
> The design for the federation work has been published and there is a very
> well-written design document. It explains the pros-and-cons of each design
> point. It would be nice if more people can review this document and provide
> comments on how to make it better. The implementation is in progress but
> that does not mean that the
> Allen: can you please describe what you mean by "It sounds like merging into
> trunk is extremely premature"? If we can make all unit tests pass
> successfully on the branch, then do you think we should merge that branch
> into the trunk?
> Konstantin: I agree that federation introduces new code complexity. But it
> is a fact that introducing a new heavy-weight feature will add complexity.
> If you have a different proposal (and implementation) to scale namenode,
> please share it with us and we can then evaluate these designs in terms of
> complexity/feature. If you have questions about certain issues in the
> design, it would be great if you can ask them now. Hopefully, the folks
> doing the implementation can then provide you performance numbers to
> alleviate your concerns.
> From the way I look at it, I think the federation-feature is a huge
> positive step in the right direction.
> On Mon, Mar 14, 2011 at 10:28 AM, Konstantin Shvachko <
> [EMAIL PROTECTED]> wrote:
>> Allen is right.
>> This is a huge new feature with 86 jiras already filed, which
>> increases the complexity of the code base.
>> Having an in-depth motivation and benchmarking will be needed before the
>> community decides on adopting it for support.
>> On Sat, Mar 12, 2011 at 8:43 AM, Allen Wittenauer
>> <[EMAIL PROTECTED]> wrote:
>> > On Mar 3, 2011, at 2:41 PM, Suresh Srinivas wrote:
>> > > We have started pushing changes for namenode federation in to the
>> > branch HDFS-1052. The work items are created as subtask of the jira
>> > HDFS-1052 and are based on the design document published in the same
>> > By the end of this week, we will complete pushing the changes to
>> > branch. Though the changes in these jiras are already committed, please
>> > provide your feedback on either HDFS-1052 or its subtasks. New items
>> > come out of the feedback will be addressed in new jiras.
>> > >
>> > > Current status of the development:
>> > > # The testing of this feature is underway. Most of the basic
>> > functionality has been tested both for a single namenode cluster (for
>> > backward compatibility) and with multiple namenodes.
>> > > # All the existing tests and newly added tests pass (same as trunk).