Hive >> mail # user >> Is this a known Bug: Multi Inserts from partitioned source ignore Where Clauses


John Omernik 2013-01-26, 15:17
Re: Is this a known Bug: Multi Inserts from partitioned source ignore Where Clauses
This is a known (recently fixed) bug:

https://issues.apache.org/jira/browse/HIVE-3699

Phil.
On 26 January 2013 15:17, John Omernik <[EMAIL PROTECTED]> wrote:

> I ran into an interesting bug. Basically, if your FROM () source is
> a partitioned table and you use a where clause that prunes partitions,
> every INSERT HERE SELECT * WHERE x=y ignores its specified where clause.
> This does not occur if the source partition is not specified, but if the
> source has where partition = 'x', then the where on each individual insert
> is ignored...
>
> I've included some files here
>
> testdata.tsv - Tab delimited data to prove the issue
> create_tables.hive - Creates a database and tables as well as loads the
> data from the TSV
>
> Test Cases:
> I created these test case files so that there are three types of
> insert in each case: 1. Load all data from the initial statement, 2. Load
> partial data (using a limiting clause such as where day >= '2013-01-05'),
> and 3. Load NO data from the initial statement (where 1 = 0)
>
> These tests were all run on Hive 0.9
>
> multi-flat-flat.hive - The source table and the dest tables are not
> partitioned, the where clauses work as expected:
>
> 19 Rows loaded to multi_bug_flat
> 0 Rows loaded to multi_bug_flat3
> 15 Rows loaded to multi_bug_flat2
>
> multi-part-part.hive - The source table and the dest tables are
> partitioned. The where clauses are not honored.
>
> 9 Rows loaded to multi_bug_part3
> 9 Rows loaded to multi_bug_part2
> 9 Rows loaded to multi_bug_part
>
> multi-flat-part.hive - The source table is flat, the dest table is
> partitioned - The where clauses work as expected:
>
> 0 Rows loaded to multi_bug_part3
> 15 Rows loaded to multi_bug_part2
> 19 Rows loaded to multi_bug_part
>
> multi-part-flat.hive - The source table is partitioned, the dest table is
> flat - The where clauses are not honored:
>
> 9 Rows loaded to multi_bug_flat
> 9 Rows loaded to multi_bug_flat3
> 9 Rows loaded to multi_bug_flat2
>
> multi-part-specified.hive - The source and dest are partitioned, but there
> is no partition pruning statement in the from (). This works as expected:
>
> 0 Rows loaded to multi_bug_part3
> 15 Rows loaded to multi_bug_part2
> 19 Rows loaded to multi_bug_part
>
>
> Thoughts?
>
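
The attached test files are not reproduced in this archive, so the following is only a rough sketch of the kind of statement the quoted message describes, not the actual contents of multi-part-flat.hive. The source table name `multi_bug_src` and the partition column `part_col` are hypothetical. The destination table names and the three filters come from the message, and the mapping of each filter to a table is inferred from the reported row counts. It also assumes the flat destination tables have the same columns as the source.

```sql
-- Hypothetical reconstruction of a multi-insert from a partitioned source.
-- multi_bug_src and part_col are assumed names, not taken from the thread.
FROM (
  -- The pruning predicate on the source partition; per the report, its
  -- presence is what triggers the problem.
  SELECT * FROM multi_bug_src WHERE part_col = 'x'
) src
-- 1. Load all data from the initial statement.
INSERT OVERWRITE TABLE multi_bug_flat
  SELECT src.*
-- 2. Load partial data using a limiting clause.
INSERT OVERWRITE TABLE multi_bug_flat2
  SELECT src.* WHERE src.day >= '2013-01-05'
-- 3. Load no data at all.
INSERT OVERWRITE TABLE multi_bug_flat3
  SELECT src.* WHERE 1 = 0;
```

With correct behaviour the three destinations should receive different row counts, as in the flat-source cases above. On versions affected by HIVE-3699, the per-insert WHERE clauses are reportedly ignored, so all three tables end up with the same rows.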
John Omernik 2013-01-26, 15:27
Navis류승우 2013-01-28, 00:53