Re: Textfile compression using Gzip codec
Well... the HiveException has the word "metadata" in it. Maybe that's a hint, or maybe a red herring. :)  Let's try the following:

1. show create table facts520_normal_text;

2. Anything useful at this URL, or is it just the same stack dump?
http://aana1.ird.com:50030/taskdetails.jsp?jobid=job_201306051948_0010&tipid=task_201306051948_0010_m_000002
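For what it's worth, here is a rough sketch of what I'd run (table names and settings are lifted from your message below). The one deliberate change is the codec class name: the class that ships with Hadoop is org.apache.hadoop.io.compress.GzipCodec (lowercase "z"), while your session sets GZipCodec. That spelling mismatch may or may not be what's biting you, but it's cheap to rule out:

hive> SHOW CREATE TABLE facts520_normal_text;

hive> SET hive.exec.compress.output=true;
hive> SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
hive> SET mapred.output.compression.type=BLOCK;
hive> INSERT OVERWRITE TABLE facts520_gzip_text SELECT * FROM facts520_normal_text;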
On Wed, Jun 5, 2013 at 3:17 AM, Sachin Sudarshana <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I have Hive 0.10 + (CDH 4.2.1 patches) installed on my cluster.
>
> I have a table facts520_normal_text stored as a textfile. I'm trying to
> create a compressed table from this table using the GZip codec.
>
> hive> SET hive.exec.compress.output=true;
> hive> SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GZipCodec;
> hive> SET mapred.output.compression.type=BLOCK;
>
> hive> Create table facts520_gzip_text
>     > (fact_key BIGINT,
>     > products_key INT,
>     > retailers_key INT,
>     > suppliers_key INT,
>     > time_key INT,
>     > units INT)
>     > ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
>     > LINES TERMINATED BY '\n'
>     > STORED AS TEXTFILE;
>
> hive> INSERT OVERWRITE TABLE facts520_gzip_text SELECT * from facts520_normal_text;
>
> When I run the above queries, the MR job fails.
>
> The error that the Hive CLI itself shows is the following:
>
> Total MapReduce jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_201306051948_0010, Tracking URL = http://aana1.ird.com:50030/jobdetails.jsp?jobid=job_201306051948_0010
> Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_201306051948_0010
> Hadoop job information for Stage-1: number of mappers: 3; number of reducers: 0
> 2013-06-05 21:09:42,281 Stage-1 map = 0%,  reduce = 0%
> 2013-06-05 21:10:11,446 Stage-1 map = 100%,  reduce = 100%
> Ended Job = job_201306051948_0010 with errors
> Error during job, obtaining debugging information...
> Job Tracking URL: http://aana1.ird.com:50030/jobdetails.jsp?jobid=job_201306051948_0010
> Examining task ID: task_201306051948_0010_m_000004 (and more) from job job_201306051948_0010
> Examining task ID: task_201306051948_0010_m_000001 (and more) from job job_201306051948_0010
>
> Task with the most failures(4):
> -----
> Task ID:
>   task_201306051948_0010_m_000002
>
> URL:
>   http://aana1.ird.com:50030/taskdetails.jsp?jobid=job_201306051948_0010&tipid=task_201306051948_0010_m_000002
> -----
> Diagnostic Messages for this Task:
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"fact_key":7549094,"products_key":205,"retailers_key":304,"suppliers_key":402,"time_key":103,"units":23}
>         at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>         at org.apache.hadoop.mapred.Child.main(Child.java:262)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"fact_key":7549094,"products_key":205,"retailers_key":304,"suppliers_key":402,"time_key":103,"units":23}
>         at org.apach
>
> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
> MapReduce Jobs Launched:
> Job 0: Map: 3   HDFS Read: 0 HDFS Write: 0 FAIL
> Total MapReduce CPU Time Spent: 0 msec
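(And if the insert does go through after the checks above, one quick way to confirm the output really is gzip-compressed is to list the table's directory, assuming the default warehouse location, which may be different on your cluster:

hive> dfs -ls /user/hive/warehouse/facts520_gzip_text/;

the data files written by the insert should show up with a .gz extension.)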