Hive, mail # user - Textfile compression using Gzip codec


Re: Textfile compression using Gzip codec
Stephen Sprague 2013-06-05, 18:28
Well... the HiveException has the word "metadata" in it. Maybe that's a
hint, or maybe a red herring. :) Let's try the following:

1.  show create table facts520_normal_text;

2.  anything useful at this URL?
http://aana1.ird.com:50030/taskdetails.jsp?jobid=job_201306051948_0010&tipid=task_201306051948_0010_m_000002
or is it just the same stack dump?

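The checks above could be run in the Hive CLI roughly as follows. This is a sketch that uses only the table name from the quoted message. `DESCRIBE FORMATTED` and the bare `SET <property>;` form are extras that are not part of the original suggestion, and the remark about the codec class name is an assumption about one possible cause, not a confirmed diagnosis.

```sql
-- Step 1: print the DDL that Hive has stored for the source table.
SHOW CREATE TABLE facts520_normal_text;

-- Optional extra (not in the original reply): storage format, SerDe and location.
DESCRIBE FORMATTED facts520_normal_text;

-- SET with a property name and no value prints the current setting,
-- which shows what the session will actually pass to the job.
SET hive.exec.compress.output;
SET mapred.output.compression.codec;

-- Worth comparing against the class Hadoop ships, which is spelled with a
-- lowercase "z": org.apache.hadoop.io.compress.GzipCodec. The quoted session
-- uses "GZipCodec"; whether that is the cause here is an assumption.
```

If the task page from step 2 shows more than the truncated dump in the CLI, the innermost "Caused by:" line is usually the most informative one.
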
On Wed, Jun 5, 2013 at 3:17 AM, Sachin Sudarshana
<[EMAIL PROTECTED]> wrote:

> Hi,
>
> I have hive 0.10 + (CDH 4.2.1 patches) installed on my cluster.
>
> I have a table facts520_normal_text stored as a textfile. I'm trying to
> create a compressed table from this table using GZip codec.
>
> hive> SET hive.exec.compress.output=true;
> hive> SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GZipCodec;
> hive> SET mapred.output.compression.type=BLOCK;
>
> hive>
>     > Create table facts520_gzip_text
>     > (fact_key BIGINT,
>     > products_key INT,
>     > retailers_key INT,
>     > suppliers_key INT,
>     > time_key INT,
>     > units INT)
>     > ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
>     > LINES TERMINATED BY '\n'
>     > STORED AS TEXTFILE;
>
> hive> INSERT OVERWRITE TABLE facts520_gzip_text SELECT * from facts520_normal_text;
>
>
> When I run the above queries, the MR job fails.
>
> The error that the Hive CLI itself shows is the following:
>
> Total MapReduce jobs = 3
> Launching Job 1 out of 3
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_201306051948_0010, Tracking URL = http://aana1.ird.com:50030/jobdetails.jsp?jobid=job_201306051948_0010
> Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_201306051948_0010
> Hadoop job information for Stage-1: number of mappers: 3; number of reducers: 0
> 2013-06-05 21:09:42,281 Stage-1 map = 0%,  reduce = 0%
> 2013-06-05 21:10:11,446 Stage-1 map = 100%,  reduce = 100%
> Ended Job = job_201306051948_0010 with errors
> Error during job, obtaining debugging information...
> Job Tracking URL: http://aana1.ird.com:50030/jobdetails.jsp?jobid=job_201306051948_0010
> Examining task ID: task_201306051948_0010_m_000004 (and more) from job job_201306051948_0010
> Examining task ID: task_201306051948_0010_m_000001 (and more) from job job_201306051948_0010
>
> Task with the most failures(4):
> -----
> Task ID:
>   task_201306051948_0010_m_000002
>
> URL:
>   http://aana1.ird.com:50030/taskdetails.jsp?jobid=job_201306051948_0010&tipid=task_201306051948_0010_m_000002
> -----
> Diagnostic Messages for this Task:
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"fact_key":7549094,"products_key":205,"retailers_key":304,"suppliers_key":402,"time_key":103,"units":23}
>         at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>         at org.apache.hadoop.mapred.Child.main(Child.java:262)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"fact_key":7549094,"products_key":205,"retailers_key":304,"suppliers_key":402,"time_key":103,"units":23}
>         at org.apach
>
> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
> MapReduce Jobs Launched:
> Job 0: Map: 3   HDFS Read: 0 HDFS Write: 0 FAIL
> Total MapReduce CPU Time Spent: 0 msec