Hive >> mail # dev >> Tez branch and tez based patches


Re: Tez branch and tez based patches
> The Hive bylaws, https://cwiki.apache.org/confluence/display/Hive/Bylaws, lay out what votes are needed for what. I don't see anything there about
> needing 3 +1s for a branch. Branching would seem to fall under code
> change, which requires one vote and a minimum length of 1 day.

You could argue that all you need is one +1 to create a branch, but this is
more than a branch. If you are talking about something that is:
1) going to cause major re-factoring of critical pieces of hive like
ExecDriver and MapRedTask
2) going to be very disruptive to the efforts of other committers
3) something that may be a major architectural change

then getting the project on board with the idea first is a good idea.

Now I want to point something out. Here are some recent initiatives in hive:

1) At one point there was a big initiative to "support oracle". After the
initial work, there are patches in Jira, and no one seems to care about Oracle
support.
2) Another such decision was the "support windows" one. There are
probably 4 Windows patches awaiting review.
3) I still have no clue what the official hadoop1, hadoop2, hadoop 0.23
support perspective is, but every couple of weeks we get another Jira about
something not working or not tested on one of those versions, and it seems
like several builds are broken.
4) Hive storage handler: after the initial implementation no one cares to
review any other storage handler implementation. There are 3 or more patches
there, and I could not even find anyone willing to review the Cassandra
storage handler I spent months on.
5) ORC, Vectorization
6) Windowing: committed, numerous check-style violations.

We have !!!160+!!! PATCH_AVAILABLE Jira issues. Few active committers. We
are spread very thin, and embarking on another side project not involved
with core hive seems like the wrong direction at the moment.


On Mon, Jul 15, 2013 at 8:37 PM, Alan Gates <[EMAIL PROTECTED]> wrote:

>
> On Jul 13, 2013, at 9:48 AM, Edward Capriolo wrote:
>
> > I have started to see several refactoring patches around tez.
> > https://issues.apache.org/jira/browse/HIVE-4843
> >
> > This is the only mention on the hive list I can find with tez:
> > "Makes sense. I will create the branch soon.
> >
> > Thanks,
> > Ashutosh
> >
> >
> > On Tue, Jun 11, 2013 at 7:44 PM, Gunther Hagleitner <
> > [EMAIL PROTECTED]> wrote:
> >
> >> Hi,
> >>
> >> I am starting to work on integrating Tez into Hive (see HIVE-4660, design
> >> doc has already been uploaded - any feedback will be much appreciated).
> >> This will be a fair amount of work that will take time to stabilize/test.
> >> I'd like to propose creating a branch in order to be able to do this
> >> incrementally and collaboratively. In order to progress rapidly with this,
> >> I would also like to go "commit-then-review".
> >>
> >> Thanks,
> >> Gunther.
> >> "
> >
> > These refactorings are largely destructive to a number of bug fixes and
> > language improvements in Hive. Those language improvements and bug fixes
> > have been sitting in Jira for quite some time now, marked patch-available,
> > and are waiting for review.
> >
> > There are a few things I want to point out:
> > 1) Normally we create design docs in our wiki (which this one is not)
> > 2) Normally when the change is significantly complex we get multiple
> > committers to comment on it (which we did not)
> > On point 2, no one -1'd the branch, but this is really something that should
> > have required a +1 from 3 committers.
>
> The Hive bylaws,  https://cwiki.apache.org/confluence/display/Hive/Bylaws, lay out what votes are needed for what.  I don't see anything there about
> needing 3 +1s for a branch.  Branching would seem to fall under code
> change, which requires one vote and a minimum length of 1 day.
>
> >
> > I for one am not completely sold on Tez.
> > http://incubator.apache.org/projects/tez.html.
> > "directed-acyclic-graph of tasks for processing data" this description
> > sounds like many things which have never become popular. One to think of