Hive >> mail # user >> Textfile compression using Gzip codec

Re: Textfile compression using Gzip codec
aha! All's well that ends well then! :)
On Thu, Jun 6, 2013 at 9:49 AM, Sachin Sudarshana
<[EMAIL PROTECTED]> wrote:

> Hi Stephen,
>
> Thank you for your reply.
>
> But it's the silliest error on my side. It's a typo!
>
> The codec is org.apache.hadoop.io.compress.GzipCodec and not
> org.apache.hadoop.io.compress.GZipCodec.
> I regret making that mistake.
>
> Thank you,
> Sachin
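
Once the codec name is spelled correctly, a minimal sketch of enabling gzip text output looks like this (facts520_gzip_text is a hypothetical target table, and the property names are the old MR1-style ones from this era of Hadoop; newer versions use mapreduce.output.fileoutputformat.compress.codec):

```sql
-- Enable compressed query output for the session.
SET hive.exec.compress.output=true;
-- The exact class name matters: GzipCodec, not GZipCodec
-- (Java class lookups are case-sensitive).
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;

-- Hypothetical target table: rows written here land as .gz text files
-- under the table's warehouse location.
INSERT OVERWRITE TABLE facts520_gzip_text
SELECT * FROM facts520_normal_text;
```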
>
>
> On Thu, Jun 6, 2013 at 10:07 PM, Stephen Sprague <[EMAIL PROTECTED]> wrote:
>
>> Hi Sachin,
>> Like you say, it looks like something to do with the GZipCodec, all right.
>> And that would make sense given your original problem.
>>
>> Yeah, one would think it'd be in there by default, but for whatever reason
>> it's not finding it. At least the problem is now identified.
>>
>> Now _my guess_ is that maybe your hadoop core-site.xml file might need to
>> list the codecs available under the property name:
>> "io.compression.codecs".  Can you chase that up as a possibility and let us
>> know what you find out?
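
If the codec genuinely were missing, that property would look roughly like this in core-site.xml (the codec list below is illustrative, not a recommendation for any particular cluster):

```xml
<!-- core-site.xml: comma-separated list of codec classes Hadoop can load -->
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
</property>
```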
>>
>> On Thu, Jun 6, 2013 at 4:02 AM, Sachin Sudarshana <
>> [EMAIL PROTECTED]> wrote:
>>
>>> Hi Stephen,
>>>
>>> hive> show create table facts520_normal_text;
>>> OK
>>> CREATE  TABLE facts520_normal_text(
>>>   fact_key bigint,
>>>   products_key int,
>>>   retailers_key int,
>>>   suppliers_key int,
>>>   time_key int,
>>>   units int)
>>> ROW FORMAT DELIMITED
>>>   FIELDS TERMINATED BY ','
>>>   LINES TERMINATED BY '\n'
>>> STORED AS INPUTFORMAT
>>>   'org.apache.hadoop.mapred.TextInputFormat'
>>> OUTPUTFORMAT
>>>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
>>> LOCATION
>>>   'hdfs://aana1.ird.com/user/hive/warehouse/facts_520.db/facts520_normal_text'
>>> TBLPROPERTIES (
>>>   'numPartitions'='0',
>>>   'numFiles'='1',
>>>   'transient_lastDdlTime'='1369395430',
>>>   'numRows'='0',
>>>   'totalSize'='545216508',
>>>   'rawDataSize'='0')
>>> Time taken: 0.353 seconds
>>>
>>>
>>> The syserror log shows this:
>>>
>>> java.lang.IllegalArgumentException: Compression codec
>>> org.apache.hadoop.io.compress.GZipCodec was not found.
>>>     at org.apache.hadoop.mapred.FileOutputFormat.getOutputCompressorClass(FileOutputFormat.java:85)
>>>     at org.apache.hadoop.hive.ql.exec.Utilities.getFileExtension(Utilities.java:934)
>>>     at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:469)
>>>     at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:543)
>>>     at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
>>>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
>>>     at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
>>>     at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
>>>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
>>>     at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
>>>     at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
>>>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
>>>     at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:546)
>>>     at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
>>>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>>>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
>>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
>>>     at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>     at javax.security.auth.Subject.doAs(Subject.java:415)
>>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>>>     at org.apache.hadoop.mapred.Child.main(Child.java:262)
>>> Caused by: java.lang.ClassNotFoundException: Class