Hive >> mail # user >> ORC Tuning - Examples?


ORC Tuning - Examples?
I am looking for guidance (read: examples) on tuning ORC settings for my
data.  I have seen the documentation that lists the defaults along with a
brief description of each setting.  What I am looking for is some examples
of things to try.  *Note: I understand that nobody wants to make a sweeping
declaration like "set this setting to X" without knowing the data.*  That
said, I would love to see some examples, specifically around:

orc.row.index.stride

orc.compress.size

orc.stripe.size
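
For concreteness, here is a minimal sketch of where these three settings live, assuming they are applied per table through TBLPROPERTIES at creation time. The table name, columns, and values are hypothetical and chosen only for illustration; they are not tuning recommendations.

```sql
-- Hypothetical table: name, columns, and values are illustrative assumptions.
-- orc.row.index.stride: number of rows between row-index entries
-- orc.compress.size:    compression chunk size, in bytes
-- orc.stripe.size:      stripe size, in bytes
CREATE TABLE example_events (
  event_id BIGINT,
  payload  STRING
)
STORED AS ORC
TBLPROPERTIES (
  "orc.row.index.stride" = "10000",
  "orc.compress.size"    = "262144",
  "orc.stripe.size"      = "268435456"
);
```

Each candidate combination would need its own copy of the table, loaded with the same data, so that file sizes and query times can be compared side by side.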
For example, I'd love to see some statements like:
If your data has lots of columns of small data, and you'd like better x,
try changing setting y, because this allows Hive to do z when querying.
If your data has few columns of large data, try changing setting y, because
this allows Hive to do z when querying.
It would be really neat to see some examples so we can get in and tune our
data. Right now, everything is a crapshoot for me, and I don't know whether
there are detrimental effects that may make themselves known later.
Any input would be welcome.
Lefty Leverenz 2013-11-13, 01:51
Yin Huai 2013-11-13, 19:13
John Omernik 2013-11-13, 21:44
Yin Huai 2013-11-14, 04:29