Hive, mail # dev - Review Request 12705: HIVE-4878: With Dynamic partitioning, some queries would scan default partition even if query is not using it.


John Pullokkaran 2013-07-17, 22:19
Re: Review Request 12705: HIVE-4878: With Dynamic partitioning, some queries would scan default partition even if query is not using it.
Ashutosh Chauhan 2013-07-22, 22:05

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12705/#review23657
-----------------------------------------------------------

ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java
<https://reviews.apache.org/r/12705/#comment47555>

    Why are we restricting this to strict mode? We should skip the default partition in all cases unless the user explicitly requests it. The assumption is that the default partition contains rows that were malformed in some way at load time, so they should be excluded from all further query processing.
- Ashutosh Chauhan
On July 17, 2013, 10:19 p.m., John Pullokkaran wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12705/
> -----------------------------------------------------------
>
> (Updated July 17, 2013, 10:19 p.m.)
>
>
> Review request for hive and Ashutosh Chauhan.
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> With dynamic partitioning, Hive would scan the default partition in some cases even if the query excludes it. As part of partition pruning, the predicate is narrowed down to the pieces that involve only partition columns. This predicate is then evaluated against each partition's values to determine whether the scan should include that partition.
> But in some cases (such as comparing "__HIVE_DEFAULT_PARTITION__" to a numeric data type), expression evaluation would fail and return NULL instead of true/false. In such cases the partition is added to the unknown partitions, which are subsequently scanned.
>
> This fix avoids scanning the default partition if all of the following are true:
> a) Hive dynamic partition mode is strict (hive.exec.dynamic.partition.mode=strict).
> b) The partition pruning expression failed to evaluate for the given partition.
> c) At least one of the partition's columns has the default partition value.
>
>
> Diffs
> -----
>
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java 6a4a360
>   ql/src/test/queries/clientpositive/dynamic_partition_skip_default.q PRE-CREATION
>   ql/src/test/results/clientpositive/dynamic_partition_skip_default.q.out PRE-CREATION
>
> Diff: https://reviews.apache.org/r/12705/diff/
>
>
> Testing
> -------
>
> Hive Unit Tests Passed.
>
>
> Thanks,
>
> John Pullokkaran
>
>

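The three conditions in the description can be sketched as a small decision function. This is only an illustration of the logic as described in the review request, not the code in PartitionPruner.java: the class name, the method shouldScan, and its parameters are hypothetical, and a null Boolean stands in for a pruning expression that failed to evaluate.

```java
import java.util.Arrays;
import java.util.List;

public class DefaultPartitionSkipSketch {
    // Hive's default value for hive.exec.default.partition.name.
    static final String DEFAULT_PARTITION_NAME = "__HIVE_DEFAULT_PARTITION__";

    /**
     * Hypothetical helper: decides whether a partition should be scanned.
     *
     * @param evalResult      result of the pruning predicate for this partition,
     *                        or null if evaluation failed (condition b)
     * @param partitionValues values of the partition columns for this partition
     * @param strictMode      true if hive.exec.dynamic.partition.mode=strict (condition a)
     */
    static boolean shouldScan(Boolean evalResult, List<String> partitionValues, boolean strictMode) {
        if (evalResult != null) {
            // The predicate evaluated cleanly: follow its answer.
            return evalResult;
        }
        // Condition c: at least one partition column holds the default partition value.
        boolean hasDefaultValue = partitionValues.contains(DEFAULT_PARTITION_NAME);
        if (strictMode && hasDefaultValue) {
            // All three conditions hold: skip the default partition.
            return false;
        }
        // Otherwise the partition stays "unknown" and is scanned conservatively.
        return true;
    }

    public static void main(String[] args) {
        List<String> defaultPart = Arrays.asList("2013", DEFAULT_PARTITION_NAME);
        List<String> normalPart = Arrays.asList("2013", "07");

        System.out.println(shouldScan(null, defaultPart, true));
        System.out.println(shouldScan(null, defaultPart, false));
        System.out.println(shouldScan(null, normalPart, true));
    }
}
```

In this sketch, the reviewer's suggestion to skip the default partition regardless of mode corresponds to dropping the strictMode check, which is the same as always passing true for that argument.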
John Pullokkaran 2013-07-23, 01:49
John Pullokkaran 2013-07-24, 20:16