Hive >> mail # dev >> Merging HCatalog into Hive


Re: Merging HCatalog into Hive
Alright, I've gotten some feedback from Brock on the JIRA question, and Carl, in a live conversation, expressed his desire to move hcat into the Hive namespace sooner rather than later.  So the proposal is that we'd move the code to org.apache.hive.hcatalog, and create shell classes and interfaces in org.apache.hcatalog for all public classes and interfaces so that existing code stays backward compatible.  I'm fine with doing this now.
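
To illustrate the shell-class idea in the paragraph above, here is a minimal sketch, written in Python only for brevity; the real change would be Java classes in the two packages. The names HCatExampleReader and LegacyHCatExampleReader are hypothetical and do not come from the HCatalog code base. The sketch assumes a shell is simply a subclass kept under the old name that adds no behavior and warns callers to migrate.

```python
import warnings


class HCatExampleReader:
    """Hypothetical class standing in for code moved to the new namespace
    (the role org.apache.hive.hcatalog plays in the proposal)."""

    def __init__(self, table):
        self.table = table

    def describe(self):
        return f"reader for table {self.table}"


class LegacyHCatExampleReader(HCatExampleReader):
    """Hypothetical shell kept under the old name (the role of
    org.apache.hcatalog). It adds no behavior of its own."""

    def __init__(self, table):
        # Assumption: the shell warns callers so they can migrate over time.
        warnings.warn(
            "LegacyHCatExampleReader is a compatibility shell; "
            "use HCatExampleReader instead",
            DeprecationWarning,
            stacklevel=2,
        )
        super().__init__(table)


if __name__ == "__main__":
    # Old client code keeps working unchanged through the shell.
    reader = LegacyHCatExampleReader("web_logs")
    print(reader.describe())
```

Because the shell inherits everything from the relocated class, code written against the old name keeps working, and an instance of the shell is still an instance of the new class.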

So, let's get started.  Carl, could you create an hcatalog directory under trunk/hive and grant the listed hcat committers karma on it?  Then I'll get started on moving the actual code.

Alan.

On Feb 24, 2013, at 12:22 PM, Brock Noland wrote:

> Looks good from my perspective and I am glad to see this moving forward.
>
> Regarding #4 (JIRA)
>
> "I don't know if there's a way to upload existing JIRAs into Hive's JIRA,
> but I think it would be better to leave them where they are."
>
> JIRA has a bulk move feature, but I am curious as to why we would leave them
> under the old project? There might be a good reason to orphan them, but my
> first thought is that it would be nice to have them under the HIVE project
> simply for search purposes.
>
> Brock
>
>
>
>
> On Fri, Feb 22, 2013 at 7:12 PM, Alan Gates <[EMAIL PROTECTED]> wrote:
>
>> Alright, our vote has passed, it's time to get on with merging HCatalog
>> into Hive.  Here are the things I can think of that we need to deal with.
>> Please add any issues I've missed:
>>
>> 1) Moving the code
>> 2) Dealing with domain names in the code
>> 3) The mailing lists
>> 4) The JIRA
>> 5) The website
>> 6) Committer rights
>> 7) Make a proposal for how HCat is released going forward
>> 8) Publish an FAQ
>>
>> Proposals for how we handle these:
>> Below I propose an approach for how to handle each of these.  Feedback
>> welcome.
>>
>> 1) Moving the code
>> I propose that HCat move into a subdirectory of Hive.  This fits nicely
>> into Hive's structure since it already has metastore, ql, etc.  We'd just
>> add 'hcatalog' as a new directory.  This directory would contain hcatalog
>> as it is today.  It does not follow Hive's standard build model so we'd
>> need to do some work to make it so that building Hive also builds HCat, but
>> this should be minimal.
>>
>> 2) Dealing with domain names
>> HCat code currently is under org.apache.hcatalog.  Do we want to change
>> it?  In time we probably should change it to match the rest of Hive
>> (org.apache.hadoop.hive.hcatalog).  We need to do this in a backward
>> compatible way.  I propose we leave it as is for now and if we decide to in
>> the future we can move the actual code to org.apache.hadoop.hive.hcatalog
>> and create shell classes under org.apache.hcatalog.
>>
>> 3) The mailing lists
>> Given that our goal is to merge the projects and not create a subproject
>> we should merge the mailing lists rather than keep hcat specific lists.  We
>> can ask infra to remove hcatalog-*@incubator.apache.org and forward any
>> new mail to the appropriate Hive lists.  We need to find out if they can
>> auto-subscribe people from the hcat lists to the hive lists.  Given that
>> traffic on the Hive lists is an order of magnitude higher we should warn
>> people before we auto-subscribe them and allow them a chance to get off.
>>
>> 4) JIRA
>> We can create an hcatalog component in Hive's JIRA.  All new HCat issues
>> could be filed there.  I don't know if there's a way to upload existing
>> JIRAs into Hive's JIRA, but I think it would be better to leave them where
>> they are.  We should see if infra can turn off the ability to create new
>> JIRAs in hcatalog.
>>
>> 5) Website
>> We will need to integrate HCatalog's website with Hive's.  This should be
>> easy except for the documentation.  HCat uses Forrest for docs; Hive uses a
>> wiki.  We will need to put links under 'Documentation' for older versions
>> of HCat docs so users can find them.  As far as how docs are handled for
>> the next version of HCatalog, I think that depends on the answer to