Hive >> mail # user >> Textfile compression using Gzip codec


Re: Textfile compression using Gzip codec
Hi Stephen,

Thank you for your reply.

But it's the silliest error on my side. It's a typo!

The codec is org.apache.hadoop.io.compress.*GzipCodec* and not
org.apache.hadoop.io.compress.*GZipCodec*.

I regret making that mistake.
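
For reference, a minimal sketch of session settings with the corrected class name, as they might appear in a Hive script. The thread does not show the statements that were actually run, so the property names here are an assumption: hive.exec.compress.output turns on compressed query output, and mapred.output.compression.codec is the older-style name for the codec setting that FileOutputFormat.getOutputCompressorClass (seen in the trace below) looks up.

```sql
-- Assumed settings; the statements actually used are not shown in this thread.
SET hive.exec.compress.output=true;
-- Java class names are case-sensitive: GzipCodec, not GZipCodec.
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
```

With the misspelled name, the lookup fails only when a task tries to load the class, which is why the error shows up in the task log rather than when the SET statement is issued.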

Thank you,
Sachin
On Thu, Jun 6, 2013 at 10:07 PM, Stephen Sprague <[EMAIL PROTECTED]> wrote:

> Hi Sachin,
> Like you say, it looks like it has something to do with the GZipCodec all right. And
> that would make sense given your original problem.
>
> Yeah, one would think it'd be in there by default, but for whatever reason
> it's not finding it. At least the problem is now identified.
>
> Now _my guess_ is that maybe your Hadoop core-site.xml file might need to
> list the codecs available under the property name:
> "io.compression.codecs".  Can you chase that up as a possibility and let us
> know what you find out?
>
>
>
>
> On Thu, Jun 6, 2013 at 4:02 AM, Sachin Sudarshana <[EMAIL PROTECTED]> wrote:
>
>> Hi Stephen,
>>
>> hive> show create table facts520_normal_text;
>> OK
>> CREATE  TABLE facts520_normal_text(
>>   fact_key bigint,
>>   products_key int,
>>   retailers_key int,
>>   suppliers_key int,
>>   time_key int,
>>   units int)
>> ROW FORMAT DELIMITED
>>   FIELDS TERMINATED BY ','
>>   LINES TERMINATED BY '\n'
>> STORED AS INPUTFORMAT
>>   'org.apache.hadoop.mapred.TextInputFormat'
>> OUTPUTFORMAT
>>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
>> LOCATION
>>   'hdfs://aana1.ird.com/user/hive/warehouse/facts_520.db/facts520_normal_text'
>> TBLPROPERTIES (
>>   'numPartitions'='0',
>>   'numFiles'='1',
>>   'transient_lastDdlTime'='1369395430',
>>   'numRows'='0',
>>   'totalSize'='545216508',
>>   'rawDataSize'='0')
>> Time taken: 0.353 seconds
>>
>>
>> The syserror log shows this:
>>
>> java.lang.IllegalArgumentException: Compression codec org.apache.hadoop.io.compress.GZipCodec was not found.
>>   at org.apache.hadoop.mapred.FileOutputFormat.getOutputCompressorClass(FileOutputFormat.java:85)
>>   at org.apache.hadoop.hive.ql.exec.Utilities.getFileExtension(Utilities.java:934)
>>   at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:469)
>>   at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:543)
>>   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
>>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
>>   at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
>>   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
>>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
>>   at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
>>   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
>>   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
>>   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:546)
>>   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
>>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
>>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
>>   at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>>   at java.security.AccessController.doPrivileged(Native Method)
>>   at javax.security.auth.Subject.doAs(Subject.java:415)
>>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>>   at org.apache.hadoop.mapred.Child.main(Child.java:262)
>> Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.io.compress.GZipCodec not found
>>   at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1493)
>>   at org.apache.hadoop.mapred.FileOutputFormat.getOutputCompressorClass(FileOutputFormat.java:82)