Hive >> mail # user >> Re: Hive query problem on S3 table


Tim Bittersohl 2013-04-19, 09:36
Re: Hive query problem on S3 table
I had already tried the following.

On the internet I found a hint that setting this configuration property
should solve the problem:

hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat

But doing so just gives me a RuntimeException:

 

java.lang.RuntimeException: org.apache.hadoop.hive.ql.io.HiveInputFormat
                at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:333)
                at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
                at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
                at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
                at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1352)
                at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1138)
                at org.apache.hadoop.hive.ql.Driver.run(Driver.java:951)
                at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:198)
                at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:644)
                at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:628)
                at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
                at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
                at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
                at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
                at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
                at java.lang.Thread.run(Thread.java:722)

13/04/18 15:37:14 ERROR exec.ExecDriver: Exception: org.apache.hadoop.hive.ql.io.HiveInputFormat
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
13/04/18 15:37:14 ERROR ql.Driver: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask

 

 

 

 

From: shrikanth shankar [mailto:[EMAIL PROTECTED]]
Sent: Thursday, April 18, 2013 17:32
To: [EMAIL PROTECTED]
Subject: Re: Hive query problem on S3 table

 

Tim,

  Could you try doing

set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;

before running the query?

 

Shrikanth

On Apr 18, 2013, at 8:09 AM, Tim Bittersohl wrote:

Thanks for your answer. I tested the program with an S3N setup and
unfortunately got the same error.

 

 

From: Dean Wampler [mailto:[EMAIL PROTECTED]]
Sent: Thursday, April 18, 2013 16:25
To: [EMAIL PROTECTED]
Subject: Re: Hive query problem on S3 table

 

I'm not sure what's happening here, but one suggestion: use s3n://...
instead of s3://... The native ("n") version is supposed to provide better
performance.

 

dean

 

On Thu, Apr 18, 2013 at 8:43 AM, Tim Bittersohl <[EMAIL PROTECTED]> wrote:

Hi,

 

I just found out that I don't have to change Hadoop's default file system.

Only the location in the CREATE TABLE command has to be changed:

 

CREATE EXTERNAL TABLE testtable(nyseVal STRING, cliVal STRING, dateVal STRING, number1Val STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'
LINES TERMINATED BY '\\n'
STORED AS TextFile LOCATION "s3://hadoop-bucket/data/"

 

 

But when I try to access the table with a command that creates a Hadoop job,
I get the following error:

 

13/04/18 15:29:36 ERROR security.UserGroupInformation: PriviledgedActionException as:tim (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: /data/NYSE_daily.txt

java.io.FileNotFoundException: File does not exist: /data/NYSE_daily.txt
                at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:807)
                at org.apache.hadoop.mapred.lib.CombineFileInputFormat$OneFileInfo.<init>(CombineFileInputFormat.java:462)
                at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getMoreSplits(CombineFileInputFormat.java:256)
                at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:212)
                at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:411)
                at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:377)
                at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:387)
                at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1091)
                at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1083)
                at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
                at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:993)
                at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:946)
                at java.security.AccessController.doPrivileged(Native Method)
                at javax.security.auth.Subject.doAs(Subject.java:415)
                at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
                at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:946)
                at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:920)
                at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:447)
                at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
                at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
                at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
                at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1352)

                a
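
One observation on the trace above, offered as a sketch and not as a confirmed diagnosis: the lookup goes through DistributedFileSystem (HDFS), and the reported path /data/NYSE_daily.txt carries neither the s3:// scheme nor the bucket name from the table's LOCATION. The Python snippet below is only an illustration using the standard library, not Hadoop or Hive code. It assumes the file NYSE_daily.txt sits directly under the table location, and shows that the path component of such a URI is exactly the string in the exception.

```python
from urllib.parse import urlsplit

# Table LOCATION from the CREATE EXTERNAL TABLE statement, plus the file name
# from the exception (assumed to sit directly under that location).
location = "s3://hadoop-bucket/data/"
file_uri = location + "NYSE_daily.txt"

parts = urlsplit(file_uri)

# Scheme and authority identify the file system and the bucket.
print(parts.scheme)   # s3
print(parts.netloc)   # hadoop-bucket

# The path alone no longer says which file system it belongs to.
print(parts.path)     # /data/NYSE_daily.txt
```

If only the path component reaches the file system lookup, it would be resolved against the default file system, which would be consistent with the HDFS "File does not exist" error shown here.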