Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive, mail # dev - Review Request 14130: Merge vectorization branch to trunk


Copy link to this message
-
Re: Review Request 14130: Merge vectorization branch to trunk
Carl Steinbach 2013-09-19, 05:02


> On Sept. 16, 2013, 12:02 a.m., Carl Steinbach wrote:
> >
>
> Jitendra Pandey wrote:
>     I have filed jiras HIVE-5308, HIVE-5309, HIVE-5314 to address some of the review comments. I will upload an updated patch as soon as those jiras are committed. The individual comments are answered inline.

Thanks for addressing these issues.
> On Sept. 16, 2013, 12:02 a.m., Carl Steinbach wrote:
> > ql/src/gen/vectorization/ExpressionTemplates/ColumnArithmeticColumn.txt, line 1
> > <https://reviews.apache.org/r/14130/diff/1/?file=351623#file351623line1>
> >
> >     We currently use Apache Velocity to generate test code at compile-time (e.g. TestCliDriver, ...). I realize that the templating code in CodeGen and TestCodeGen is pretty simple, but was wondering if it might be better from a build and maintenance standpoint to use Velocity instead.
> >    
> >     Also, is it possible to select a less generic file suffix for the template files, e.g. *.t or *.tmpl?
>
> Jitendra Pandey wrote:
>       I have added a patch on HIVE-5308, that removes the generated code for both expressions and tests and instead the code is generated as an ant-task during the build.
>       The patch doesn't however uses Velocity, do you think its ok to move to velocity as a follow up task post merge? I will take care of template file suffixes as part of move to velocity because I believe velocity requires its own suffix.

I haven't looked at it closely enough to be able to say that using Velocity is a superior approach. I think it's something worth investigating in the future, but agree with you that there's no need to do it right away.
> On Sept. 16, 2013, 12:02 a.m., Carl Steinbach wrote:
> > ql/src/gen/vectorization/ExpressionTemplates/ColumnArithmeticColumn.txt, line 27
> > <https://reviews.apache.org/r/14130/diff/1/?file=351623#file351623line27>
> >
> >     In addition to the name (and preferably path) of the template I think this comment should also include the name and path of the code generator, and a warning that it should not be modified by hand.
>
> Jitendra Pandey wrote:
>     The generated code will not be committed anymore. The code will be generated in the build directory and a clean will remove it.

Awesome!
> On Sept. 16, 2013, 12:02 a.m., Carl Steinbach wrote:
> > ql/src/gen/vectorization/org/apache/hadoop/hive/ql/exec/vector/gen/TestCodeGen.java, line 1
> > <https://reviews.apache.org/r/14130/diff/1/?file=351645#file351645line1>
> >
> >     Maybe this should go in ql/src/test/gen. Thoughts?
>
> Jitendra Pandey wrote:
>     I have renamed this file to GenVectorTestCode and moved it to ant/src/org/apache/hadoop/hive/ant/ so that it runs along with other ant-tasks. Is that ok? The jira with this change is HIVE-5308.

Sounds good.
> On Sept. 16, 2013, 12:02 a.m., Carl Steinbach wrote:
> > ql/src/test/queries/clientpositive/vectorization_0.q, line 20
> > <https://reviews.apache.org/r/14130/diff/1/?file=351959#file351959line20>
> >
> >     What is the expected behavior when vectorized.execution=enabled and the source table is not reading ORC formatted data? I think it's worth adding some additional tests (either positive or negative) to lock down this behavior.
>
> Jitendra Pandey wrote:
>     Any query that vectorization cannot execute must fall back to non-vector mode. If the input format doesn't provide vectorized input the query must still succeed. I have added a test in my patch on HIVE-5309 that runs a query on rc-file table with vectorization turned on. The query executes successfully.

Sounds good.
- Carl
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14130/#review26119
-----------------------------------------------------------
On Sept. 13, 2013, 5:51 p.m., Jitendra Pandey wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/14130/