Hive >> mail # user >> TPC-H queries on Hive 0.12


Re: TPC-H queries on Hive 0.12
As I remember, those scripts store the tables as text files. With 0.12, I
think ORC should be used instead. Also, I think the sub-queries should be
merged back into a single query. Within a single query, if a reduce join is
converted to a map join, that map join can be merged into its child job. But
if the join is evaluated as a separate query, Hive has to run a standalone
map-only job for it, because it does not know that the map-only job only
produces intermediate results. For query 17 and query 18, once each is
written as a single query, the Correlation Optimizer should be able to
optimize them (set hive.optimize.correlation=true).
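
A minimal sketch of what this could look like for query 17, not taken from the HIVE-600 scripts. The table names lineitem_orc and part_orc are hypothetical, the source tables lineitem and part are assumed to be the existing text-file tables, and the rewrite with a derived table in the FROM clause is just one possible single-query formulation. The filter constants are the default TPC-H substitution values.

```sql
-- Assumption: lineitem and part already exist as text-file tables.
-- Copy them into ORC-backed tables (hypothetical names).
CREATE TABLE lineitem_orc STORED AS ORC AS SELECT * FROM lineitem;
CREATE TABLE part_orc     STORED AS ORC AS SELECT * FROM part;

-- Let Hive convert eligible reduce joins to map joins,
-- and enable the Correlation Optimizer.
SET hive.auto.convert.join=true;
SET hive.optimize.correlation=true;

-- TPC-H Q17 as one query: the per-part average is a derived table
-- in FROM instead of a separately materialized intermediate table.
SELECT SUM(l.l_extendedprice) / 7.0 AS avg_yearly
FROM lineitem_orc l
JOIN part_orc p
  ON p.p_partkey = l.l_partkey
JOIN (
  SELECT l_partkey          AS t_partkey,
         0.2 * AVG(l_quantity) AS t_avg_quantity
  FROM lineitem_orc
  GROUP BY l_partkey
) t
  ON t.t_partkey = l.l_partkey
WHERE p.p_brand = 'Brand#23'
  AND p.p_container = 'MED BOX'
  AND l.l_quantity < t.t_avg_quantity;
```

Running EXPLAIN on the query with and without hive.optimize.correlation should show whether the number of MapReduce jobs actually drops in a given setup.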

Thanks,

Yin
On Fri, Nov 22, 2013 at 1:31 PM, Avrilia Floratou <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I'd like to run a few TPC-H queries on Hive 0.12. I've found the TPC-H
> scripts here:
>
> https://issues.apache.org/jira/browse/HIVE-600.
>
> but noticed that these scripts were generated a long time ago. Since Hive
> could not support the full SQL-92 specification, some queries were split
> into smaller sub-queries whose results were materialized. Is there any
> change in HiveQL (in Hive 0.12) that would affect the way the TPC-H queries
> are written?
>
> Thanks,
> Avrilia
>