Re: Best format to use
Hey Mark,

The Gzip codec writes files with a .gz extension, not .deflate (which
comes from DeflateCodec). You may want to re-check your settings.
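
If you want to confirm which codec actually wrote your existing files,
one rough check is to look at the first two bytes. The sketch below is
only an illustration, not anything Hadoop ships: guess_codec is a
hypothetical helper, and it assumes the .deflate files are zlib-wrapped
streams (RFC 1950), which is what I would expect from the default codec.
Gzip files start with the magic bytes 1f 8b (RFC 1952).

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # fixed magic number from RFC 1952


def guess_codec(head: bytes) -> str:
    """Guess the container format from the first two bytes of a file.

    Hypothetical helper for illustration; it only inspects headers and
    does not validate the rest of the stream.
    """
    if head[:2] == GZIP_MAGIC:
        return "gzip"
    # A zlib header (RFC 1950) is two bytes: the low nibble of the first
    # byte is 8 (deflate), and the 16-bit big-endian value is a multiple
    # of 31.
    if (
        len(head) >= 2
        and (head[0] & 0x0F) == 8
        and ((head[0] << 8) | head[1]) % 31 == 0
    ):
        return "zlib/deflate"
    return "unknown"


if __name__ == "__main__":
    # Made-up log lines standing in for a real daily log file.
    sample = b"2013-04-09 05:18:00 GET /index.html 200\n" * 100
    print(guess_codec(gzip.compress(sample)))
    print(guess_codec(zlib.compress(sample)))
```

On a real file you would pass in the first couple of bytes read from
HDFS or from a local copy instead of the in-memory sample.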

Impala questions are best resolved at its current user and developer
community at https://groups.google.com/a/cloudera.org/forum/#!forum/impala-user.
That said, Impala does currently support LZO (and also Indexed LZO)
compressed text files, so you may want to try that, as it's splittable
once indexed (unlike Gzip files).
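
To illustrate why a Gzip file cannot be split, here is a small sketch
using only the Python standard library (nothing Hadoop-specific;
can_decode_from is a hypothetical helper and the log data is made up).
A reader handed an arbitrary offset into the stream has no header and no
deflate state to resume from, so decoding fails unless it starts at the
beginning.

```python
import gzip
import zlib


def can_decode_from(data: bytes, offset: int) -> bool:
    """Return True if a gzip decoder can start reading at `offset`."""
    try:
        # wbits=31 tells zlib to expect a gzip header and trailer.
        zlib.decompressobj(wbits=31).decompress(data[offset:])
        return True
    except zlib.error:
        return False


if __name__ == "__main__":
    blob = gzip.compress(b"line of log text\n" * 10_000)
    print(can_decode_from(blob, 0))               # start of the stream
    print(can_decode_from(blob, len(blob) // 2))  # arbitrary mid-point
```

This is why a single large Gzip file ends up being processed by one map
task, whereas a splittable format lets several tasks each take a part.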

On Tue, Apr 9, 2013 at 5:18 AM, Mark <[EMAIL PROTECTED]> wrote:
> Trying to determine the best format to use for storing daily logs. We recently switched from snappy (.snappy) to gzip (.deflate) but I'm wondering if there is something better? Our main clients for these daily logs are Pig and Hive using an external table. We were thinking about testing out Impala but we see that it doesn't work with compressed text files. Any suggestions?
> Thanks

Harsh J