Hive >> mail # user >> Textfile compression using Gzip codec


Re: Textfile compression using Gzip codec
aha!   All's well that ends well then! :)
On Thu, Jun 6, 2013 at 9:49 AM, Sachin Sudarshana
<[EMAIL PROTECTED]> wrote:

> Hi Stephen,
>
> Thank you for your reply.
>
> But it's the silliest error on my side. It's a typo!
>
> The codec is org.apache.hadoop.io.compress.GzipCodec and not
> org.apache.hadoop.io.compress.GZipCodec.
> I regret making that mistake.
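>
> [For anyone hitting the same error: with the codec name spelled correctly,
> Gzip output from Hive of that era is enabled with the session settings
> below. This is a minimal sketch; the table and query names are just
> placeholders, and the MR1-style property name matches the Hadoop version
> in the stack trace further down.]
>
> ```sql
> -- Turn on compression for query output files.
> SET hive.exec.compress.output=true;
> -- Note the exact capitalization: GzipCodec, not GZipCodec.
> SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
>
> -- Hypothetical example: writes gzip-compressed text files.
> INSERT OVERWRITE TABLE facts520_gzip_text
> SELECT * FROM facts520_normal_text;
> ```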
>
> Thank you,
> Sachin
>
>
> On Thu, Jun 6, 2013 at 10:07 PM, Stephen Sprague <[EMAIL PROTECTED]> wrote:
>
>> Hi Sachin,
>> Like you say, it looks like something to do with the GZipCodec, all right.
>> And that would make sense given your original problem.
>>
>> Yeah, one would think it'd be in there by default, but for whatever reason
>> it's not finding it. At least the problem is now identified.
>>
>> Now _my guess_ is that maybe your hadoop core-site.xml file might need to
>> list the codecs available under the property name:
>> "io.compression.codecs".  Can you chase that up as a possibility and let us
>> know what you find out?
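>>
>> [Something like the fragment below is what I mean. This is only a sketch
>> of the property; the exact codec list should match whatever your
>> distribution ships, and entries are comma-separated with case-sensitive
>> class names.]
>>
>> ```xml
>> <!-- core-site.xml: codecs Hadoop will load by class name -->
>> <property>
>>   <name>io.compression.codecs</name>
>>   <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
>> </property>
>> ```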
>>
>>
>>
>>
>> On Thu, Jun 6, 2013 at 4:02 AM, Sachin Sudarshana <
>> [EMAIL PROTECTED]> wrote:
>>
>>> Hi Stephen,
>>>
>>> hive> show create table facts520_normal_text;
>>> OK
>>> CREATE  TABLE facts520_normal_text(
>>>   fact_key bigint,
>>>   products_key int,
>>>   retailers_key int,
>>>   suppliers_key int,
>>>   time_key int,
>>>   units int)
>>> ROW FORMAT DELIMITED
>>>   FIELDS TERMINATED BY ','
>>>   LINES TERMINATED BY '\n'
>>> STORED AS INPUTFORMAT
>>>   'org.apache.hadoop.mapred.TextInputFormat'
>>> OUTPUTFORMAT
>>>   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
>>> LOCATION
>>>   'hdfs://aana1.ird.com/user/hive/warehouse/facts_520.db/facts520_normal_text'
>>> TBLPROPERTIES (
>>>   'numPartitions'='0',
>>>   'numFiles'='1',
>>>   'transient_lastDdlTime'='1369395430',
>>>   'numRows'='0',
>>>   'totalSize'='545216508',
>>>   'rawDataSize'='0')
>>> Time taken: 0.353 seconds
>>>
>>>
>>> The syserror log shows this:
>>>
>>> java.lang.IllegalArgumentException: Compression codec
>>> org.apache.hadoop.io.compress.GZipCodec was not found.
>>>  at org.apache.hadoop.mapred.FileOutputFormat.getOutputCompressorClass(FileOutputFormat.java:85)
>>>  at org.apache.hadoop.hive.ql.exec.Utilities.getFileExtension(Utilities.java:934)
>>>  at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:469)
>>>  at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:543)
>>>  at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
>>>  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
>>>  at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
>>>  at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
>>>  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
>>>  at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
>>>  at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
>>>  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
>>>  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:546)
>>>  at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
>>>  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>>>  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
>>>  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
>>>  at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>>>  at java.security.AccessController.doPrivileged(Native Method)
>>>  at javax.security.auth.Subject.doAs(Subject.java:415)
>>>  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>>>  at org.apache.hadoop.mapred.Child.main(Child.java:262)
>>> Caused by: java.lang.ClassNotFoundException: Class
>>> org.apache.hadoop.io.compress.GZipCodec
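>>>
>>> [The ClassNotFoundException above is simply case sensitivity: the JVM
>>> resolves class names exactly, so GZipCodec and GzipCodec are different
>>> names. A small stand-alone sketch of the same effect, using JDK classes
>>> instead of the Hadoop codec so it runs anywhere:]
>>>
>>> ```java
>>> public class CodecLookup {
>>>     // Returns true only if a class with exactly this name is on the classpath.
>>>     public static boolean classExists(String name) {
>>>         try {
>>>             Class.forName(name);
>>>             return true;
>>>         } catch (ClassNotFoundException e) {
>>>             return false;
>>>         }
>>>     }
>>>
>>>     public static void main(String[] args) {
>>>         // Correct capitalization resolves...
>>>         System.out.println(classExists("java.util.zip.GZIPOutputStream")); // prints true
>>>         // ...a one-letter case difference does not, just like GZipCodec vs GzipCodec.
>>>         System.out.println(classExists("java.util.zip.GzipOutputStream")); // prints false
>>>     }
>>> }
>>> ```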