Re: [DISCUSS] HCatalog becoming a subproject of Hive
On a functional level I don't think there is going to be much of a
difference between the subproject option proposed by Travis and the other
option where HCatalog becomes a TLP. In both cases HCatalog and Hive will
have separate committers, separate code repositories, separate release
cycles, and separate project roadmaps. Aside from ASF bureaucracy, I think
the only major difference between the two options is that the subproject
route will give the rest of the community the false impression that the two
projects have coordinated roadmaps and a process to prevent overlapping
functionality from appearing in both projects. Consequently, if these are
the only two options then I would prefer that HCatalog become a TLP.

On the other hand, I also agree with many of the sentiments that have
already been expressed in this thread, namely that the two projects are
closely related and that it would benefit the community at large if the two
projects could be brought closer together. Up to this point the major
source of pain for the HCatalog team has been the frequent necessity of
making changes on both the Hive and HCatalog sides when implementing new
features in HCatalog. This situation is compounded by the ASF requirement
that release artifacts may not depend on snapshot artifacts from other ASF
projects. Furthermore, if Hive adds a dependency on HCatalog then it will
be subject to these same problems (in addition to the gross circular
dependency!).

I think the best way to avoid these problems is for HCatalog to become a
Hive submodule. In this scenario HCatalog would exist as a subdirectory in
the Hive repository and would be distributed as a Hive artifact in future
Hive releases. In addition to solving the problems I mentioned earlier, I
think this would also help to assuage the concerns of many Hive committers
who don't want to see the MetaStore split out into a separate project.

Thanks.

Carl

On Thu, Dec 13, 2012 at 7:59 PM, Namit Jain <[EMAIL PROTECTED]> wrote:

> I am fine with this. Any hive committer who wants to volunteer to be
> an hcat shepherd is welcome.
>
>
>
> On 12/14/12 7:01 AM, "Travis Crawford" <[EMAIL PROTECTED]> wrote:
>
> >Thanks for reviving this thread. Reviewing the comments, everyone seems
> >to agree HCatalog makes sense as a Hive subproject. I think that's
> >great news for the Hadoop community.
> >
> >The discussion seems to have turned to one of committer permissions. I
> >agree with the Hive folks' sentiment that it's something that must be
> >earned. That said, I've found it challenging at times to get the
> >patches into Hive that would help me earn hive committer
> >responsibility.
> >
> >Proposal: if a couple hive committers can volunteer to be hcat
> >shepherds, we can work with the shepherds when making hive changes in
> >a timely manner. Conversely, we can help shepherd any hive committers
> >who are interested in working more with hcat. There are certainly
> >benefits to cross-committership, and this approach could help both
> >groups build a history of meaningful contributions and earn the
> >privilege & responsibility of being committers.
> >
> >Thoughts?
> >
> >--travis
> >
> >
> >
> >On Thu, Dec 13, 2012 at 11:59 AM, Edward Capriolo <[EMAIL PROTECTED]>
> >wrote:
> >> I was initially hesitant about hcatalog, mostly because I imagined we would
> >> end up in a spot very similar to this.
> >>
> >> Namely, the hcatalog folks are interested in making a metastore to support
> >> pig, hive, and map reduce. However, I get the impression that many in hive
> >> do not care much to have a metastore that caters to everyone. Their needs
> >> are only based on what hive needs, which I believe is the wrong way to look
> >> at this situation.
> >>
> >> I thought to reply to this thread because I have been following this Jira:
> >> https://issues.apache.org/jira/browse/HIVE-3752
> >>
> >> On a high level I do not like this duplication of effort and code. If hive
> >> is compatible with hcatalog I do not see why we put off merging the two