On Mon, Mar 14, 2011 at 6:12 PM, Konstantin Shvachko
<[EMAIL PROTECTED]> wrote:
> Dhruba, good you are speaking up for federation.
> I consider it important as it means more support for the feature in the
> The purpose of my reply was to get this discussion going, as I found Allen's
> question unanswered for 2 weeks.
> The concern he has seems legitimate to me. If ops think federation will
> "make running a grid much much harder" I want to know why and how much
> Because cluster "manageability" is claimed as one of the objectives of
> I am certainly familiar with the design, having been a part of it for a while.
> All my concerns have been articulated and are well known, though not all of
> them have been addressed.
> The way I see it now, Federation introduces
> - lots of code complexity to the system
> - harder manageability, according to Allen
> - potential performance degradation (tbd)
> And the main question for those 95% of users, who don't run large clusters
> don't want to place all their compute resources in one data center, is what
> is the advantage in supporting it?
> Performance-wise there are 2 main aspects:
> - Does federation give me the same cluster performance if I don't federate?
> - If I federate how much more throughput can I get?
This reminds me of multi-cell GFS (discussed by Quinlan & McKusick at
http://bit.ly/einKMn). I used to run some of those clusters, and
compared to standard single-master clusters of course they were more
complex to manage. However, if you have apps needing that much master
capacity & that much shared read+write bandwidth across large pools of
storage nodes, it's worth the trouble.
Assuming most people don't use federation it shouldn't add complexity
in the common case, but opens up some needed capabilities for large
sites. Stuff like datanode management would become more challenging in
a multi-master environment, but that's where automation comes in. If
you don't have teams building tools to manage your datacenter, it's
likely you don't need federation either.
I'm currently running a handful of HDFS clusters & my overall reaction
to federation is "that's cool, but I probably won't need it for a few
years." Seems like the sort of thing the vast majority of sites won't
even encounter - you'd just add datanodes to one master & start using
> On Mon, Mar 14, 2011 at 10:43 AM, Dhruba Borthakur <[EMAIL PROTECTED]> wrote:
>> Hi folks,
>> The design for the federation work has been published and there is a very
>> well-written design document. It explains the pros-and-cons of each design
>> point. It would be nice if more people can review this document and provide
>> comments on how to make it better. The implementation is in progress but
>> that does not mean that the
>> Allen: can you please describe what you mean by "It sounds like merging into
>> trunk is extremely premature"? If we can make all unit tests pass
>> successfully on the branch, then do you think we should merge that branch
>> into the trunk?
>> Konstantin: I agree that federation introduces new code complexity. But it
>> is a fact that introducing a new heavy-weight feature will add complexity.
>> If you have a different proposal (and implementation) to scale namenode,
>> please share it with us and we can then evaluate these designs in terms of
>> complexity/feature. If you have questions about certain issues in the
>> design, it would be great if you can ask them now. Hopefully, the folks
>> doing the implementation can then provide you performance numbers to
>> alleviate your concerns.
>> From the way I look at it, I think the federation feature is a huge
>> positive step in the right direction.
>> On Mon, Mar 14, 2011 at 10:28 AM, Konstantin Shvachko <
>> [EMAIL PROTECTED]> wrote:
>>> Allen is right.
>>> This is a huge new feature with 86 jiras already filed, which