Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive >> mail # user >> Multi-group-by select always scans entire table


Copy link to this message
-
Re: Multi-group-by select always scans entire table
Hi Jan,
Glad you found something workable.

What version of Hive are you using? Could you also please check what the value of the property hive.optimize.ppd is for you?

Thanks,
Mark

----- Original Message -----
From: "Jan Dolinár" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Sent: Tuesday, May 29, 2012 1:57:25 AM
Subject: Re: Multi-group-by select always scans entire table
On Fri, May 25, 2012 at 12:03 PM, Jan Dolinár < [EMAIL PROTECTED] > wrote:
-- see what happens when you try to perform multi-group-by query on one of the partitions
EXPLAIN EXTENDED
FROM partition_test
LATERAL VIEW explode(col1) tmp AS exp_col1
INSERT OVERWRITE DIRECTORY '/test/1'
SELECT exp_col1
WHERE (part_col=2)
INSERT OVERWRITE DIRECTORY '/test/2'
SELECT exp_col1
WHERE (part_col=2);
-- result: it wants to scan all partitions :-(
Since nobody else did, let me answer myself... In the end I found out that the correct partition pruning can be achieved using subquery. Continuing the example from my last post, the query would be:
FROM (
SELECT * FROM partition_test
LATERAL VIEW explode(col1) tmp AS exp_col1
WHERE part_col=2
) t
INSERT OVERWRITE DIRECTORY '/test/1'
SELECT exp_col1
INSERT OVERWRITE DIRECTORY '/test/2'
SELECT exp_col1;
I still think the pruning should work correctly no matter how the query is written, but for now I'm happy with this solution.
J. Dolinar
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB