Hive, mail # dev - Tez branch and tez based patches

Edward Capriolo 2013-07-13, 16:48
Alan Gates 2013-07-16, 00:37
Edward Capriolo 2013-07-16, 01:51
Alan Gates 2013-07-16, 18:24
Edward Capriolo 2013-07-16, 20:08
Re: Tez branch and tez based patches
Edward Capriolo 2013-07-17, 05:20
There are some points I want to bring up. First, I am on the PMC. Here is
something I find relevant:



The role of the PMC from a Foundation perspective is oversight. The main
role of the PMC is not code and not coding - but to ensure that all legal
issues are addressed, that procedure is followed, and that each and every
release is the product of the community as a whole. That is key to our
litigation protection mechanisms.

Secondly the role of the PMC is to further the long term development and
health of the community as a whole, and to ensure that balanced and wide
scale peer review and collaboration does happen. Within the ASF we worry
about any community which centers around a few individuals who are working
virtually uncontested. We believe that this is detrimental to quality,
stability, and robustness of both code and long term social structures.




All other decisions happen on the dev list, discussions on the private list
are kept to a minimum.

"If it didn't happen on the dev list, it didn't happen" - which leads to:

a) Elections of committers and PMC members are published on the dev list
once finalized.

b) Out-of-band discussions (IRC etc.) are summarized on the dev list as
soon as they have impact on the project, code or community.

https://issues.apache.org/jira/browse/HIVE-4660, ironically titled "Let
their be Tez", has not been +1'ed by any committer. It was never discussed on
the dev or the user list (as far as I can tell).

As a PMC member I feel we need more discussion on Tez on the dev list along
with a wiki-fied design document. Topics of discussion should include:

1) What is tez?

2) How is tez different from oozie, http://code.google.com/p/hop/,
http://cs.brown.edu/~backman/cmr.html , and other DAG and/or streaming map
reduce tools/frameworks? Why should we use this and not those?

3) When can we expect the first tez release?

4) How much effort is involved in integrating hive and tez?

5) Who is ready to commit to this effort?

6) Can we expect this work to be done in one hive release?

In my opinion we should not start any work on this tez-hive integration until
these questions are answered to the satisfaction of the hive developers.

On Mon, Jul 15, 2013 at 9:51 PM, Edward Capriolo <[EMAIL PROTECTED]> wrote:

> >>The Hive bylaws,
> >>https://cwiki.apache.org/confluence/display/Hive/Bylaws , lay out what
> >>votes are needed for what.  I don't see anything there about needing 3 +1s
> >>for a branch.  Branching would seem to fall under code change, which
> >>requires one vote and a minimum length of 1 day.
> You could argue that all you need is one +1 to create a branch, but this
> is more than a branch. If you are talking about something that is:
> 1) going to cause major re-factoring of critical pieces of hive like
> ExecDriver and MapRedTask
> 2) going to be very disruptive to the efforts of other committers
> 3) something that may be a major architectural change
> Getting the project on board with the idea is a good idea.
> Now I want to point something out. Here are some recent initiatives in
> hive:
> 1) At one point there was a big initiative to "support oracle". After the
> initial work, there are patches in Jira, and no one seems to care about
> oracle support.
> 2) Another such decision was the "support windows" one; there are
> probably 4 windows patches waiting for review.
> 3) I still have no clue what the official hadoop1, hadoop2, hadoop 0.23
> support perspective is, but every couple of weeks we get another jira about
> something not working/testing on one of those versions, and it seems like
> several builds are broken.
> 4) Hive-storage handler, after the initial implementation no one cares to
> review any other storage handler implementation, 3 patches there or more,
Alan Gates 2013-07-17, 19:35
Edward Capriolo 2013-07-17, 20:41
Ashutosh Chauhan 2013-07-18, 00:43
Edward Capriolo 2013-07-20, 15:10
Gunther Hagleitner 2013-07-23, 00:08
Alan Gates 2013-07-17, 21:41
Edward Capriolo 2013-07-30, 04:02
Edward Capriolo 2013-07-30, 04:53
Alan Gates 2013-08-05, 17:54
Edward Capriolo 2013-08-16, 13:13
Edward Capriolo 2013-08-16, 14:54
Alan Gates 2013-08-05, 17:40
Brock Noland 2013-07-16, 15:56