Hive, mail # user - hive - snappy and sequence file vs RC file


Re: hive - snappy and sequence file vs RC file
Owen O'Malley 2012-06-26, 16:49
SequenceFile compared to RCFile:
  * More widely deployed.
  * Available from MapReduce and Pig
  * Doesn't compress as small (in RCFile all of each column's values are put
together, which compresses better)
  * Decompresses and deserializes all of the columns, even if you are only
reading a few
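
To make the last two points concrete, here is a minimal Python sketch of the two layouts. It is not the actual SequenceFile or RCFile on-disk format: the sample table, the function names, and the use of zlib on tab- and newline-delimited text are all assumptions made for illustration. It only shows why a row-oriented block has to be fully decompressed to read one column, while a column-grouped layout can decompress just the column it needs.

```python
import zlib

# Hypothetical sample table: (id, user, status), 1000 rows.
rows = [(str(i), "user%d" % (i % 50), "ok" if i % 7 else "error")
        for i in range(1000)]


def encode_row_oriented(rows):
    """Row layout (SequenceFile-like): serialize whole rows and
    compress them together as a single block."""
    raw = "\n".join("\t".join(r) for r in rows).encode("utf-8")
    return zlib.compress(raw)


def encode_column_oriented(rows):
    """Column layout (RCFile-like): put each column's values together
    and compress every column as its own block."""
    return [zlib.compress("\n".join(col).encode("utf-8"))
            for col in zip(*rows)]


def read_column_row_oriented(block, index):
    """Reading one column still decompresses and parses every row."""
    raw = zlib.decompress(block)
    values = [line.split("\t")[index]
              for line in raw.decode("utf-8").split("\n")]
    return values, len(raw)


def read_column_column_oriented(blocks, index):
    """Reading one column decompresses only that column's block."""
    raw = zlib.decompress(blocks[index])
    return raw.decode("utf-8").split("\n"), len(raw)


row_values, row_bytes = read_column_row_oriented(encode_row_oriented(rows), 2)
col_values, col_bytes = read_column_column_oriented(encode_column_oriented(rows), 2)
print("bytes decompressed, row layout:   ", row_bytes)
print("bytes decompressed, column layout:", col_bytes)
```

Both reads return the same status values, but the column layout only has to decompress the bytes of that one column. How much smaller the column layout compresses depends on the data, so the sketch does not claim a particular ratio.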

In either case, for long-term storage, you should seriously consider the
default codec (Hadoop's zlib-based DefaultCodec) instead of Snappy, since it
provides much tighter compression (at the cost of more CPU to compress it).

-- Owen