Hive >> mail # dev >> [DISCUSS] HCatalog becoming a subproject of Hive

Re: [DISCUSS] HCatalog becoming a subproject of Hive
I was initially hesitant about hcatalog, mostly because I imagined we would
end up in a spot very similar to this.

Namely, the hcatalog folks are interested in making a metastore to support
pig, hive, and map reduce. However, I get the impression that many in hive
do not care much to have a metastore that caters to everyone; their
requirements are based only on what hive needs, which I believe is the
wrong way to look at this situation.

I thought I would reply to this thread because I have been following this Jira:

At a high level, I do not like this duplication of effort and code. If hive
is compatible with hcatalog, I do not see why we would put off merging the
two at all. Hive users would get an immediate benefit if Hive used hcatalog,
with no apparent downside. Meanwhile, we are putting this off and staying in
this awkward transition phase.

Personally, I do not have a problem being a hive committer and not having
hcatalog commit. None of the hive work I have done has ever touched the
metastore. Also, of the thousands of jiras and features we have added, only
a small portion required metastore changes.

As long as a couple of active users have commit on both hive and the
suggested hcatalog subproject, I do not think not having commit will be a
roadblock to moving hive forward.

On Mon, Dec 3, 2012 at 6:22 PM, Alan Gates <[EMAIL PROTECTED]> wrote:

> I am not sure where we are on this discussion.  So far those who have
> chimed in seemed generally positive (Namit, Edward, Clark, Alexander).
>  Namit and I have different visions for what the committership might look
> like, so I'd like to hear from other Hive PMC members what their view is on
> this.  I have to say from an HCatalog perspective the proposition is much
> less attractive without some commit rights.
> On a related note, people should be aware of these threads in the
> Incubator list:
> http://mail-archives.apache.org/mod_mbox/incubator-general/201211.mbox/%3CCAGU5spdWHNtJxgQ8f%3DnPEXx9xNLjyjOYaFfnSw4EyAjgm1c46w%40mail.gmail.com%3E
> http://mail-archives.apache.org/mod_mbox/incubator-general/201211.mbox/%3CCAKQbXgDZj_zMj4qSodXjMHV7xQZxpcY1-35cvq959YKLNd6tJQ%40mail.gmail.com%3E
> For those not inclined to read all the mails in the threads I will
> summarize (though I urge all PMC members of Hive and PPMC members of HCat
> to read both mail threads because this is highly relevant to what we are
> discussing).  There are two salient points in these threads:
> 1) It is not wise to build a subproject that is distinct from the main
> project in the sense that it has separate community members interested in
> it.  Bertrand, Arun, Chris Mattmann, and Greg Stein all spoke against this,
> and all are long-time Apache contributors with a lot of experience.  They
> were all of the opinion that it was reasonable for one project to release
> separate products.
> 2) It is not wise to have committers that have access to parts of a
> project but not others.  Greg and Bertrand argued (and Arun seemed to
> imply) that splitting up committer lists by sections of the code did not
> work out well.
> These insights cause me to question what we mean by subproject.  I had
> originally envisioned something that looked like Pig and Hive did when they
> were subprojects of Hadoop.  But this violates both 1 and 2 above.  Given
> this input from many of the "wise old timers" of Apache I think we should
> consider what we mean when we say subproject and how tightly we are willing
> to integrate these projects.  Personally I think it makes sense to continue
> to pursue integration, as I think HCat is really a set of interfaces on top
> of Hive and it makes sense to coalesce those into one project.  I guess
> this would mean HCat becomes just another set of jars that Hive releases
> when it releases, rather than a stand alone entity.  But I'm curious to
> hear what others think.
> Alan.
> On Nov 14, 2012, at 10:22 PM, Namit Jain wrote:
> > The same criteria should be applied to all Hive committers. Only a