Hive >> mail # user >> Is this a known Bug: Multi Inserts from partitioned source ignore Where Clauses


John Omernik 2013-01-26, 15:17
Re: Is this a known Bug: Multi Inserts from partitioned source ignore Where Clauses
This is a known (recently fixed) bug:

https://issues.apache.org/jira/browse/HIVE-3699

Phil.
On 26 January 2013 15:17, John Omernik <[EMAIL PROTECTED]> wrote:

> I ran into an interesting bug. Basically, if your FROM () source is
> a partitioned table and it uses a where clause that prunes partitions,
> every INSERT HERE SELECT * WHERE x=y ignores its specified where clause.
> This does not occur if the source partition is not specified, but if the
> source has where partition = 'x', then the where on each individual insert
> is ignored...
>
> I've included some files here
>
> testdata.tsv - Tab delimited data to prove the issue
> create_tables.hive - Creates a database and tables as well as loads the
> data from the TSV
>
> Test Cases:
> I created these test case files so that each case contains three types of
> insert: 1. Load all data from the initial statement, 2. Load partial data
> (using a limiting clause such as where day >= '2013-01-05'), and 3. Load
> NO data from the initial statement (where 1 = 0).
>
> These tests were all run on Hive 0.9.
>
> multi-flat-flat.hive - The source table and the dest tables are not
> partitioned, the where clauses work as expected:
>
> 19 Rows loaded to multi_bug_flat
> 0 Rows loaded to multi_bug_flat3
> 15 Rows loaded to multi_bug_flat2
>
> multi-part-part.hive - The source table and the dest tables are
> partitioned. The where clauses are not honored.
>
> 9 Rows loaded to multi_bug_part3
> 9 Rows loaded to multi_bug_part2
> 9 Rows loaded to multi_bug_part
>
> multi-flat-part.hive - The source table is flat, the dest table is
> partitioned - The where clauses work as expected:
>
> 0 Rows loaded to multi_bug_part3
> 15 Rows loaded to multi_bug_part2
> 19 Rows loaded to multi_bug_part
>
> multi-part-flat.hive - The source table is partitioned, the dest table is
> flat - The where clauses are not honored:
>
> 9 Rows loaded to multi_bug_flat
> 9 Rows loaded to multi_bug_flat3
> 9 Rows loaded to multi_bug_flat2
>
> multi-part-specified.hive - The source and dest are partitioned, but there
> is no partition pruning statement in the FROM (). This works as expected:
>
> 0 Rows loaded to multi_bug_part3
> 15 Rows loaded to multi_bug_part2
> 19 Rows loaded to multi_bug_part
>
>
> Thoughts?
>
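
For readers without the attachments, the statement shape described in the quoted message (the multi-part-flat case) might look roughly like the sketch below. The attached .hive files are not reproduced in this thread, so this is an assumption about their form, not their actual contents: the source table name multi_bug_source_part and the partition column part_col are hypothetical, and the sketch assumes the destination tables have columns matching SELECT src.*. Only the destination table names, the day predicate, and the 1 = 0 predicate come from the message.

```sql
-- Sketch only: multi_bug_source_part and part_col are hypothetical names.
FROM (
  -- Partition-pruning predicate on the partitioned source. Per the report,
  -- this is what triggers the problem on Hive 0.9.
  SELECT * FROM multi_bug_source_part WHERE part_col = 'x'
) src
-- 1. Load all rows from the initial statement.
INSERT OVERWRITE TABLE multi_bug_flat
  SELECT src.*
-- 2. Load partial data; the report says this WHERE was ignored.
INSERT OVERWRITE TABLE multi_bug_flat2
  SELECT src.* WHERE src.day >= '2013-01-05'
-- 3. Load no data; the report says this WHERE was ignored as well.
INSERT OVERWRITE TABLE multi_bug_flat3
  SELECT src.* WHERE 1 = 0;
```

According to the results quoted above, a statement of this kind loaded the same 9 rows into all three tables, while the variant without a partition predicate in the FROM () honored each per-insert WHERE.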
John Omernik 2013-01-26, 15:27
Navis류승우 2013-01-28, 00:53