RE: hdfs project separation

Your understanding is correct, but I would like to start with a less
ambitious plan. By duplicating the common code (RPC, etc.) and renaming
packages in common, we can independently build two different artifacts from
the same repo: one for HDFS and one for YARN+MR. Then we can decide whether
we want to separate these projects completely and make independent releases.

I believe the last project split failed because of common dependencies in
both MR and HDFS, which meant that changes to RPC, etc. affected both
upper-level projects. I think we should avoid that by duplicating the needed
common code.

I would like to see what the community thinks before making detailed plans.

- milind
-----Original Message-----
From: Doug Cutting [mailto:[EMAIL PROTECTED]]
Sent: Friday, October 11, 2013 11:12 AM
Subject: Re: hdfs project separation

On Fri, Oct 11, 2013 at 9:14 AM, Milind Bhandarkar wrote:
> If HDFS is released independently, with its own RPC and protocol versions,
> features such as pluggable namespaces will not have to wait for the next
> mega-release of the entire stack.

The plan as I understand it is to eventually be able to release common/hdfs
& yarn/mr independently, as two, three or perhaps four different products.
Once we've got that down we can consider splitting into multiple TLPs. For
this to happen, folks need to volunteer to create an independent release:
establishing a plan, helping to make the required changes, calling the vote,
etc. Someone could propose doing this first with HDFS, YARN or whatever
they think is best. It would take concerted effort by a few folks, along
with the consent of the rest of the project.

Do you have a detailed plan?  If so, you could share it and start trying to
build consensus around it.