ORC vs TEXT file

We currently use the TEXTFILE format in Hive 0.8 when creating
external tables for intermediate processing.
I have read about ORC in Hive 0.11 and have created the same table
there in ORC format.
Without any compression, the ORC data (three files in total) occupied
twice the space of the TEXTFILE data (a single file).
Also, when I query the data from the ORC table:
Select count(*) from orc_table

It took more time than the same query against the TEXTFILE table.
However, the cumulative CPU time is lower for ORC than for TEXTFILE.
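
For reference, the comparison can be sketched roughly as follows. This is only an illustration of the setup described above: the names text_table, id, and payload and the location /tmp/text_table are hypothetical, and it assumes compression was disabled through the orc.compress table property. The actual DDL may differ.

```sql
-- Hypothetical TEXTFILE table standing in for the existing external table.
CREATE EXTERNAL TABLE text_table (id BIGINT, payload STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/tmp/text_table';

-- Same schema stored as ORC, with compression switched off (assumption).
CREATE TABLE orc_table (id BIGINT, payload STRING)
STORED AS ORC
TBLPROPERTIES ("orc.compress" = "NONE");

-- Copy the rows so both tables hold identical data.
INSERT OVERWRITE TABLE orc_table
SELECT id, payload FROM text_table;

-- The two queries whose wall-clock and cumulative CPU times are compared.
SELECT COUNT(*) FROM text_table;
SELECT COUNT(*) FROM orc_table;
```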

What sort of queries benefit from using ORC?
In which cases is TEXTFILE preferable to ORC?