Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive >> mail # dev >> Review Request 12827: HIVE-4611 - SMB joins fail based on bigtable selection policy.


Copy link to this message
-
Re: Review Request 12827: HIVE-4611 - SMB joins fail based on bigtable selection policy.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12827/#review23905
-----------------------------------------------------------

ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java
<https://reviews.apache.org/r/12827/#comment47730>

    Can you add which table(s) those are ({0})?

ql/src/java/org/apache/hadoop/hive/ql/optimizer/AvgPartitionSizeBasedBigTableSelectorForAutoSMJ.java
<https://reviews.apache.org/r/12827/#comment47732>

    you need a check here too, don't you? Otherwise you might return 0 even though that's not a valid choice.

ql/src/java/org/apache/hadoop/hive/ql/optimizer/AvgPartitionSizeBasedBigTableSelectorForAutoSMJ.java
<https://reviews.apache.org/r/12827/#comment47738>

    what's this for? is this an error condition? not sure when topOp would be null, but you might jump over it with the new code, is that what you really want?

ql/src/java/org/apache/hadoop/hive/ql/optimizer/BigTableSelectorForAutoSMJ.java
<https://reviews.apache.org/r/12827/#comment47742>

    This should be a List<>, right?

ql/src/java/org/apache/hadoop/hive/ql/optimizer/LeftmostBigTableSelectorForAutoSMJ.java
<https://reviews.apache.org/r/12827/#comment47743>

    List?

ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java
<https://reviews.apache.org/r/12827/#comment47748>

    I think only when you allocate you really need to specify ArrayList. There's multiple instances of that.

ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java
<https://reviews.apache.org/r/12827/#comment47750>

    Also, why can't seenPositions and bigTableCandidates be Sets? Seems more appropriate.

ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java
<https://reviews.apache.org/r/12827/#comment47751>

    This is confusing. Can you add comment explaining under what circumstances you return false v throw exceptions and why?

ql/src/java/org/apache/hadoop/hive/ql/optimizer/TableSizeBasedBigTableSelectorForAutoSMJ.java
<https://reviews.apache.org/r/12827/#comment47753>

    Same as above.

ql/src/java/org/apache/hadoop/hive/ql/optimizer/TableSizeBasedBigTableSelectorForAutoSMJ.java
<https://reviews.apache.org/r/12827/#comment47752>

    This one has the topOp and bigTableCandidate list reversed.
- Gunther Hagleitner
On July 22, 2013, 10:55 p.m., Vikram Dixit Kumaraswamy wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12827/
> -----------------------------------------------------------
>
> (Updated July 22, 2013, 10:55 p.m.)
>
>
> Review request for hive, Ashutosh Chauhan, Brock Noland, and Gunther Hagleitner.
>
>
> Bugs: HIVE-4611
>     https://issues.apache.org/jira/browse/HIVE-4611
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> SMB joins fail based on bigtable selection policy. The default setting for hive.auto.convert.sortmerge.join.bigtable.selection.policy will choose the big table as the one with largest average partition size. However, this can result in a query failing because this policy conflicts with the big table candidates chosen for outer joins. This policy should just be a tie breaker and not have the ultimate say in the choice of tables.
>
>
> Diffs
> -----
>
>   ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 8330f65
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/AbstractSMBJoinProc.java cc9de54
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/AvgPartitionSizeBasedBigTableSelectorForAutoSMJ.java 5320143
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/BigTableSelectorForAutoSMJ.java db5ff0f
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/LeftmostBigTableSelectorForAutoSMJ.java db3c9e7
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java d83fb66
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/TableSizeBasedBigTableSelectorForAutoSMJ.java b882f87