Hive, mail # user - LZO Compression on trunk


LZO Compression on trunk
Bennie Schut 2010-02-05, 11:22
I have a tab-separated file that I loaded with "load data inpath".
Then I run:

SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
SET mapred.map.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
select distinct login_cldr_id as cldr_id from chatsessions_load;

Ended Job = job_201001151039_1641
OK
NULL
NULL
NULL
Time taken: 49.06 seconds

However, if I run it without the SET commands I get this:
Ended Job = job_201001151039_1642
OK
2283
Time taken: 45.308 seconds

Which is the correct result.

When I do an "insert overwrite" into an RCFile table, the data is actually
compressed correctly.
When I disable compression and query this new table, the result is correct.
When I enable compression, the result is wrong again.
I see no errors in the logs.
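
For reference, the RCFile test described above can be sketched roughly as follows. The table name chatsessions_rc and the INT column type are placeholders for illustration, not the real schema, and the exact statements may differ from what was actually run.

```sql
-- Hypothetical RCFile copy of the loaded table (name and column type are assumptions).
CREATE TABLE chatsessions_rc (login_cldr_id INT) STORED AS RCFILE;

-- Write the RCFile table with LZO output compression enabled.
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
INSERT OVERWRITE TABLE chatsessions_rc
SELECT login_cldr_id FROM chatsessions_load;

-- Query with compression disabled: this is the case that returns the correct result.
SET hive.exec.compress.output=false;
SELECT DISTINCT login_cldr_id AS cldr_id FROM chatsessions_rc;

-- Query with compression enabled again: this is the case that returns NULL rows.
SET hive.exec.compress.output=true;
SELECT DISTINCT login_cldr_id AS cldr_id FROM chatsessions_rc;
```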

Any ideas why this might happen?