Hive >> mail # user >> hive - snappy and sequence file vs RC file


Re: hive - snappy and sequence file vs RC file
SequenceFile compared to RCFile:
  * More widely deployed.
  * Available from MapReduce and Pig.
  * Doesn't compress as small (in RCFile, all of each column's values are
stored together, which compresses better).
  * Uncompresses and deserializes all of the columns, even if you are only
reading a few.
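
To make the last two points concrete, here is a rough Python sketch. It is not Hive or RCFile code: the sample table, the helper names, and the use of zlib from the standard library are all assumptions made for illustration. It serializes the same rows once row by row and once column by column, compresses both, and also shows how storing each column as its own compressed blob lets a reader decompress only the column it needs.

```python
import random
import zlib


def make_rows(n, seed=0):
    """Build a hypothetical table: (user_id, country, status) as strings."""
    rng = random.Random(seed)
    countries = ["US", "DE", "IN", "BR", "JP"]
    statuses = ["active", "inactive", "pending"]
    return [
        (str(rng.randrange(1_000_000)), rng.choice(countries), rng.choice(statuses))
        for _ in range(n)
    ]


def row_layout(rows):
    """Row-oriented bytes: one record per line, fields separated by tabs."""
    return "\n".join("\t".join(row) for row in rows).encode("utf-8")


def column_layout(rows):
    """Column-oriented bytes: one column per line, values separated by tabs."""
    return "\n".join("\t".join(col) for col in zip(*rows)).encode("utf-8")


def compressed_size(data):
    """Size in bytes after zlib compression at the default level."""
    return len(zlib.compress(data))


def compress_columns(rows):
    """Compress each column separately, so columns can be read independently."""
    return [zlib.compress("\n".join(col).encode("utf-8")) for col in zip(*rows)]


def read_column(blobs, index):
    """Decompress and decode a single column; the other blobs are not touched."""
    return zlib.decompress(blobs[index]).decode("utf-8").split("\n")


if __name__ == "__main__":
    rows = make_rows(5000)
    print("row layout, compressed:   ", compressed_size(row_layout(rows)))
    print("column layout, compressed:", compressed_size(column_layout(rows)))
    countries = read_column(compress_columns(rows), 1)
    print("first countries:", countries[:5])
```

Both layouts hold exactly the same bytes before compression, so any difference in compressed size comes from the ordering alone. How large that difference is depends heavily on the data, so treat the printed numbers as an illustration rather than a benchmark.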

In either case, for long-term storage, you should seriously consider the
default codec (DefaultCodec, which is zlib-based), since it will provide much
tighter compression than Snappy (at the cost of CPU to compress it).
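
The size versus CPU trade-off can be sketched in Python as well. Snappy is not in the standard library, so this sketch swaps in zlib at its fastest level (1) as a stand-in for a speed-oriented codec and zlib at its highest level (9) for a size-oriented one. The sample data and the helper name are made up for illustration, and the timings will vary by machine.

```python
import time
import zlib


def compress_with_level(data, level):
    """Compress data with zlib at the given level; return (blob, seconds)."""
    start = time.perf_counter()
    blob = zlib.compress(data, level)
    return blob, time.perf_counter() - start


# Hypothetical log-like sample data, generated deterministically.
SAMPLE = b"".join(
    b"%d\tuser%d\tGET /page/%d\n" % (i, i % 97, i % 13) for i in range(20000)
)

if __name__ == "__main__":
    for level in (1, 9):
        blob, seconds = compress_with_level(SAMPLE, level)
        print(f"level {level}: {len(blob)} bytes in {seconds:.4f}s")
```

For data written once and read for years, the extra compression time is paid a single time while the space saving persists, which is the reasoning behind preferring the tighter codec for archival tables.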

-- Owen