Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive >> mail # dev >> Review Request: Add Vectorized Substr


Copy link to this message
-
Re: Review Request: Add Vectorized Substr

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/11106/#review20603
-----------------------------------------------------------

ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringSubstrColStartLen.java
<https://reviews.apache.org/r/11106/#comment42444>

    You can't look at isNull entries if noNulls is true for the input, otherwise you could get wrong results.
    
    noNulls = false means there may or may not be true entries in the isNull array, even for the isRepeating case

ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringSubstrColStartLen.java
<https://reviews.apache.org/r/11106/#comment42447>

    For this special case (has nulls, selected not in use) please speed things up a bit by using System.arraycopy to copy the input isNull array to the output isNull array rather than doing it with individual assignment statements.

ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringSubstrColStartLen.java
<https://reviews.apache.org/r/11106/#comment42448>

    missing period.

ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringSubstrColStartLen.java
<https://reviews.apache.org/r/11106/#comment42450>

    must check noNulls before looking at isNull array

ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringSubstrColStartLen.java
<https://reviews.apache.org/r/11106/#comment42451>

    Use system.arraycopy to copy over the isNull array for this special case instead of using individual assignment statements

ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorStringExpressions.java
<https://reviews.apache.org/r/11106/#comment42452>

    Need to have a string with multi-byte characters (or else a separate test of your helper functions using multi-byte characters). One of the other tests has some multi-byte chars in some of the strings -- check that for an example.

ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorStringExpressions.java
<https://reviews.apache.org/r/11106/#comment42454>

    All strings are starting at 0, so you won't test your helper functions for the case of a string starting in the middle of a larger byte buffer. Need to test that case either here or in a separate test case for the helper functions.

ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorStringExpressions.java
<https://reviews.apache.org/r/11106/#comment42456>

    also verify that the output noNulls and isRepeating get set correctly. E.g. set them to the wrong value before test then test them afterwards to be sure they are right.

ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorStringExpressions.java
<https://reviews.apache.org/r/11106/#comment42453>

    some lines are longer than 100 char limit. Run ant checkstyle on this.
- Eric Hanson
On May 15, 2013, 1:16 a.m., Timothy Chen wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/11106/
> -----------------------------------------------------------
>
> (Updated May 15, 2013, 1:16 a.m.)
>
>
> Review request for hive.
>
>
> Description
> -------
>
> Add Vectorized Substr
>
>
> This addresses bug HIVE-4495.
>     https://issues.apache.org/jira/browse/HIVE-4495
>
>
> Diffs
> -----
>
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/StringSubstrColStartLen.java PRE-CREATION
>   ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorStringExpressions.java 6e26412
>
> Diff: https://reviews.apache.org/r/11106/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Timothy Chen
>
>