Hive >> mail # dev >> HIVE optimizer enhancements in 0.9.0+ releases


HIVE optimizer enhancements in 0.9.0+ releases
Hi,

I am a HIVE user working on analytical applications over large data
sets. For us, HIVE performance is critical to the success of our
product. I was wondering whether any recent improvements have been
made in the optimizer layer. One of the relevant references I found
on the web is the HIVE paper
(http://infolab.stanford.edu/~ragho/hive-icde2010.pdf). If you can
send me any pointers on current enhancements, that would be great.

Some specific improvements I am looking for are:
1. Cost-based optimization (logical or physical)
2. "multi-query optimization techniques and performing generic n-way
joins in a single map-reduce job" (quoted from the future work section
of the paper above)
3. Generation and use of table statistics to produce better plans,
faster execution, etc. I know some code was added to generate column
statistics for HIVE tables. Is there any table-level statistics
generation?

Thanks for your help,
-Sukhendu