Hive, mail # user - ORC Tuning - Examples?


ORC Tuning - Examples?
John Omernik 2013-11-12, 23:15
I am looking for guidance (read: examples) on tuning ORC settings for my
data.  I have seen the documentation that lists the defaults, along with a
brief description of what each setting is.  What I am looking for is some
examples of things to try.  *Note: I understand that nobody wants to make
sweeping declarations like "set this setting to this value" without knowing
the data.*  That said, I would love to see some examples, specifically around:

orc.row.index.stride

orc.compress.size

orc.stripe.size
For example, I'd love to see some statements like:
If your data has lots of columns of small data, and you'd like better x,
try changing setting y, because this allows Hive to do z when querying.
If your data has few columns of large data, try changing setting y, because
this allows Hive to do z when querying.
It would be really neat to see some examples so we can get in and tune the
settings for our data. Right now, everything is a crapshoot for me, and I
don't know if there are detrimental effects that may only show up later.
Any input would be welcome.
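
For reference, the three settings above are ORC table properties, normally
supplied through TBLPROPERTIES when the table is created. Below is a minimal
sketch of where they go. The table name and columns are hypothetical, and the
values are what I believe the documentation lists as defaults (an assumption;
they are placeholders to tune from, not recommendations):

```sql
-- Hypothetical table; name and columns are illustrative only.
-- orc.stripe.size:      bytes per stripe
-- orc.compress.size:    bytes per compression chunk
-- orc.row.index.stride: rows between index entries
-- Values are assumed defaults; check the documentation for your Hive version.
CREATE TABLE events_orc (
  id BIGINT,
  payload STRING
)
STORED AS ORC
TBLPROPERTIES (
  'orc.stripe.size' = '268435456',
  'orc.compress.size' = '262144',
  'orc.row.index.stride' = '10000'
);
```

Changing one value at a time and re-running the same queries against a copy
of the table is probably the simplest way to compare settings.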