Hive >> mail # dev >> Tez branch and tez based patches


Messages in this thread:
Edward Capriolo 2013-07-13, 16:48
Alan Gates 2013-07-16, 00:37
Edward Capriolo 2013-07-16, 01:51
Alan Gates 2013-07-16, 18:24
Edward Capriolo 2013-07-16, 20:08
Edward Capriolo 2013-07-17, 05:20
Alan Gates 2013-07-17, 19:35
Edward Capriolo 2013-07-17, 20:41
Ashutosh Chauhan 2013-07-18, 00:43

Re: Tez branch and tez based patches
I agree we are getting into a grey area with the term "disruptive". For
reference (I have not been doing this all the time, my bad), we are
supposed to +1 and wait a day.

>> I am not familiar with these other engines, but the short answer is that
>> Tez is built to work on YARN, which works well for Hive since it is tied
>> to Hadoop

I understand what you are saying here: YARN support is a plus. However, the
rest of the answer is relevant to the discussion.

There are already frameworks like Spark that are semi-popular:
http://www.slideshare.net/jetlore/spark-and-shark-lightningfast-analytics-over-hadoop-and-hive-data
There are also other frameworks like S4 (http://incubator.apache.org/s4/) or
Storm.

A big part of making a design decision is doing a competitive analysis.
Usually that means asking yourself "What else for this is already out
there?" or "Can this be done other ways?"
I do want to be convinced that we are not locking into Tez too early with
tunnel vision. Possibly we should be thinking about how to build Hive in
such a way that many different frameworks could plug in. In other words, I
want to be convinced that Tez is the best choice, since many people are
claiming an MRR-type solution.
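
To make "plug in" a bit more concrete, here is a rough sketch of the kind of
seam I have in mind. Everything in it is hypothetical: ExecutionEngine,
EngineRegistry, and the two toy engines are names made up for illustration,
not existing Hive or Tez classes, and a real interface would need far more
than a list of stage names.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: none of these types exist in Hive.
// The planner would talk to this interface instead of a specific framework.
interface ExecutionEngine {
    String name();

    // Takes an ordered list of stage names and returns a description
    // of the plan this engine would run for them.
    String run(List<String> stages);
}

// Toy stand-in for a MapReduce-backed engine.
class MapReduceEngine implements ExecutionEngine {
    public String name() { return "mr"; }

    public String run(List<String> stages) {
        return "mr: " + String.join(" -> ", stages);
    }
}

// Toy stand-in for any DAG-style engine (Tez, Spark, or something else).
class DagEngine implements ExecutionEngine {
    public String name() { return "dag"; }

    public String run(List<String> stages) {
        return "dag: " + String.join(" -> ", stages);
    }
}

// Engines register under a name; which one to use becomes a config choice.
class EngineRegistry {
    private final Map<String, ExecutionEngine> engines = new LinkedHashMap<>();

    void register(ExecutionEngine engine) {
        engines.put(engine.name(), engine);
    }

    ExecutionEngine lookup(String name) {
        ExecutionEngine engine = engines.get(name);
        if (engine == null) {
            throw new IllegalArgumentException("unknown engine: " + name);
        }
        return engine;
    }
}
```

The point of the sketch is only that the choice of framework sits behind one
narrow interface, so an MRR-style plan could be handed to whichever engine is
registered without the rest of the code knowing which one it is.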

I will watch the video you posted and study the material myself as well.

On Wed, Jul 17, 2013 at 8:43 PM, Ashutosh Chauhan <[EMAIL PROTECTED]> wrote:

> On Wed, Jul 17, 2013 at 1:41 PM, Edward Capriolo <[EMAIL PROTECTED]> wrote:
>
> >
> > "In my opinion we should limit the amount of tez related optimizations to
> > and trunk" Refactoring that cleans up code is good, but as you have
> > pointed out there won't be a tez release until sometime this fall, and
> > this branch will be open for an extended period of time. Thus code
> > cleanups and other tez related refactoring do not need to be disruptive
> > to trunk.
>
>
> I agree Tez-specific changes need not go in trunk. But general
> refactoring and code cleanup needs to happen on trunk as and when someone
> is willing to work on those. We have to continually improve our code
> quality. Code maintainability and readability is a priority. Without that,
> code quality suffers and discourages new contributors from contributing
> because the code is unnecessarily complicated. SemanticAnalyzer is an
> 11K-line class. We need to simplify it. A patch like HIVE-4811 is a welcome
> change which tackles it. The exec package is all convoluted, mixing up
> runtime operators and drivers for runtime. That's a welcome patch because
> it makes it much easier to read and reason about that piece of code.
> HIVE-4825 is another example which improves modularity of code. For
> contributors who are exposed to Hive for the first time it will be easier
> to follow the code.
>
> Rather than disruptive to trunk, they are constructive for trunk and I am
> glad people are choosing to work on that. Tez or no Tez Hive is better off
> with these patches.
>
> Thanks,
> Ashutosh
>
>
>
> > On Wed, Jul 17, 2013 at 3:35 PM, Alan Gates <[EMAIL PROTECTED]> wrote:
> >
> > > Answers to some of your questions inlined.
> > >
> > > Alan.
> > >
> > > On Jul 16, 2013, at 10:20 PM, Edward Capriolo wrote:
> > >
> > > > There are some points I want to bring up. First, I am on the PMC.
> > > > Here is something I find relevant:
> > > >
> > > > http://www.apache.org/foundation/how-it-works.html
> > > >
> > > > ------------------------------
> > > >
> > > > The role of the PMC from a Foundation perspective is oversight. The
> > > > main role of the PMC is not code and not coding - but to ensure that
> > > > all legal issues are addressed, that procedure is followed, and that
> > > > each and every release is the product of the community as a whole.
> > > > That is key to our litigation protection mechanisms.
> > > >
> > > > Secondly the role of the PMC is to further the long term development
> > > > and health of the community as a whole, and to ensure that balanced
> > > > and wide scale peer review and collaboration does happen. Within the
> > > > ASF we

Further messages in this thread:
Gunther Hagleitner 2013-07-23, 00:08
Alan Gates 2013-07-17, 21:41
Edward Capriolo 2013-07-30, 04:02
Edward Capriolo 2013-07-30, 04:53
Alan Gates 2013-08-05, 17:54
Edward Capriolo 2013-08-16, 13:13
Edward Capriolo 2013-08-16, 14:54
Alan Gates 2013-08-05, 17:40
Brock Noland 2013-07-16, 15:56