Re: Textfile compression using Gzip codec
Hi Sachin,

Like you say, it looks like something to do with the GZipCodec, all right. And
that would make sense given your original problem.

Yeah, one would think it'd be in there by default, but for whatever reason
it's not finding it. At least the problem is now identified.

Now _my guess_ is that maybe your Hadoop core-site.xml file might need to
list the codecs available under the property name
"io.compression.codecs". Can you chase that up as a possibility and let us
know what you find out?
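
For reference, here is a rough sketch of what such an entry in core-site.xml could look like. The list of codecs is only an assumption for illustration, so adjust it to whatever is actually installed on your cluster. One thing worth double-checking while you are in there: as far as I know, the gzip codec class that ships with Hadoop is spelled "GzipCodec" (lowercase "z"), whereas your log shows "GZipCodec", and Java class names are case-sensitive.

```xml
<!-- Sketch of a core-site.xml entry; the codec list below is an assumed example, not taken from your cluster. -->
<property>
  <name>io.compression.codecs</name>
  <!-- Comma-separated, fully qualified class names. Note the spelling "GzipCodec". -->
  <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
</property>
```

If the property turns out to be fine, it may be worth checking how the codec name is spelled wherever the output compression codec is set for the job.
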
On Thu, Jun 6, 2013 at 4:02 AM, Sachin Sudarshana <[EMAIL PROTECTED]> wrote:

> Hi Stephen,
>
> hive> show create table facts520_normal_text;
> OK
> CREATE  TABLE facts520_normal_text(
>   fact_key bigint,
>   products_key int,
>   retailers_key int,
>   suppliers_key int,
>   time_key int,
>   units int)
> ROW FORMAT DELIMITED
>   FIELDS TERMINATED BY ','
>   LINES TERMINATED BY '\n'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
>   'hdfs://aana1.ird.com/user/hive/warehouse/facts_520.db/facts520_normal_text'
> TBLPROPERTIES (
>   'numPartitions'='0',
>   'numFiles'='1',
>   'transient_lastDdlTime'='1369395430',
>   'numRows'='0',
>   'totalSize'='545216508',
>   'rawDataSize'='0')
> Time taken: 0.353 seconds
>
>
> The syserror log shows this:
>
> java.lang.IllegalArgumentException: Compression codec org.apache.hadoop.io.compress.GZipCodec was not found.
>   at org.apache.hadoop.mapred.FileOutputFormat.getOutputCompressorClass(FileOutputFormat.java:85)
>   at org.apache.hadoop.hive.ql.exec.Utilities.getFileExtension(Utilities.java:934)
>   at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:469)
>   at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:543)
>   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
>   at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
>   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
>   at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
>   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
>   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:546)
>   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>   at org.apache.hadoop.mapred.Child.main(Child.java:262)
> Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.io.compress.GZipCodec not found
>   at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1493)
>   at org.apache.hadoop.mapred.FileOutputFormat.getOutputCompressorClass(FileOutputFormat.java:82)
>   ... 21 more
> java.lang.IllegalArgumentException: Compression codec org.apache.hadoop.io.compress.GZipCodec was not found.
>   at org.apache.hadoop.mapred.FileOutputFormat.getOutputCompressorClass(FileOutputFormat.java:85)
>   at org.apache.hadoop.hive.ql.exec.Utilities.getFileExtension(Utilities.java:934)
>   at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:469)