Re: Compression in Hive
1. Our MR jobs use LZO compression and write LZO files (these are NOT sequence files), which serve as the feeder files for Hive.
2. We then run aggregation reports in Hive over that data (the LZO files).

Hope this helps
Good luck
sanjay
From: "Ravi Mummulla (BIG DATA)" <[EMAIL PROTECTED]>
Reply-To: <[EMAIL PROTECTED]>
Date: Monday, June 10, 2013 6:14 AM
To: <[EMAIL PROTECTED]>
Subject: RE: Compression in Hive

Documentation is here: https://cwiki.apache.org/confluence/display/Hive/CompressedStorage

The performance overhead of compression is trivial for larger amounts of data but becomes relatively more significant as the data size gets smaller. The gain typically comes from smaller data transfers between nodes and fewer disk reads/writes. Again, the larger the data, the greater the gain.
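
To make the size argument concrete, here is a rough sketch that compares how well a tiny input and a large input compress. It is only an illustration under stated assumptions: it uses Python's built-in zlib (DEFLATE) rather than LZO, the sample record is made up, and it measures output size only, not CPU time or anything specific to Hive.

```python
import zlib


def compression_ratio(payload: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(payload)) / len(payload)


# Hypothetical sample record, standing in for a single small row.
small = b"2013-06-10,user42,click\n"
# The same record repeated many times, standing in for a large table file.
large = small * 100_000

small_ratio = compression_ratio(small)
large_ratio = compression_ratio(large)

# For the tiny input the fixed stream overhead dominates, so the
# "compressed" output is larger than the input; for the large,
# repetitive input the output is a small fraction of the original.
print(f"small input: {len(small)} bytes, ratio {small_ratio:.2f}")
print(f"large input: {len(large)} bytes, ratio {large_ratio:.4f}")
```

Real table data is far less repetitive than this sample, so actual ratios will be less dramatic, but the pattern is the same: there are fewer bytes to move between nodes and to read from or write to disk once the files are large enough.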

Thanks.

From: Sachin Sudarshana [mailto:[EMAIL PROTECTED]]
Sent: Sunday, June 9, 2013 11:04 PM
To: [EMAIL PROTECTED]
Subject: Compression in Hive

Hi,

I have been testing the usefulness of compression in Hive and have a general question:

I would like to know whether there are particular cases where compression in Hive actually proves useful when running MR jobs.

Any pointers/examples would really be useful!

Thank you,
Sachin