Hive, mail # user - Re: Hive Loading Zip CSV Files


Re: Hive Loading Zip CSV Files
Mark Grover 2012-11-13, 18:54
bcc: cdh-user

This question might be more appropriate for the Apache Hive user list, so
redirecting it there.

However, to answer your question:
From the little I've read about PKZip, it follows the standard zip format.
So the question you are really asking is whether Hive supports reading from
zip files. As far as I know, the answer is no. This is because Hadoop doesn't
have an InputFormat for reading zip files:
https://issues.apache.org/jira/browse/MAPREDUCE-210
There is also a Hive user email thread that tackles the same question:
http://mail-archives.apache.org/mod_mbox/hive-user/201203.mbox/%[EMAIL PROTECTED]%3E

Having said that, a possible workaround would be to unzip the zip files
before loading them and store the data on HDFS as SequenceFiles compressed
with a different codec (e.g. Snappy).
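
To make the unzip step concrete, here is a minimal Python sketch of the
pre-processing. It is only an illustration under some assumptions: it
recompresses each CSV with gzip rather than Snappy, because gzip ships with
the Python standard library, and the function name rezip_csvs_as_gzip and
the paths are hypothetical. It does not write SequenceFiles itself.

```python
import gzip
import shutil
import zipfile
from pathlib import Path


def rezip_csvs_as_gzip(zip_path, out_dir):
    """Extract every .csv member of a zip archive and rewrite it as .csv.gz.

    Hypothetical helper for illustration. Returns the list of files written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            # Skip directories and anything that is not a CSV file.
            if info.is_dir() or not info.filename.lower().endswith(".csv"):
                continue
            # Keep only the base name so nested archive paths stay inside
            # out_dir. Members sharing a base name would overwrite each other.
            target = out_dir / (Path(info.filename).name + ".gz")
            # Stream the member so large CSVs are never held fully in memory.
            with archive.open(info) as src, gzip.open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)
    return written
```

The resulting .csv.gz files could then be copied to HDFS (for example with
hadoop fs -put) and loaded into a plain text table. Rewriting them from
there into a Snappy-compressed SequenceFile table would be a separate step
inside Hive, which this sketch does not cover.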

Good luck!
Mark

On Tue, Nov 13, 2012 at 9:17 AM, ben <[EMAIL PROTECTED]> wrote:

> Anybody ever try to load CSV files compressed using PKZip into a Hive
> table stored as Sequence Files? Is there a SerDe out there for this?
>
> Thanks,
> Ben