Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive, mail # user - Hive JOIN fails if SELECT statement contains fields from the first table.


Copy link to this message
-
Hive JOIN fails if SELECT statement contains fields from the first table.
Bing Li 2012-01-16, 10:06
1. I create two Hive table:
Hive> CREATE EXTERNAL TABLE student_details (studentid INT,studentname
STRING,age INT,gpa FLOAT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE LOCATION  ‘/home/biadmin/hivetbl';

Hive>CREATE EXTERNAL TABLE student_score(studentid INT, classid INT,score
FLOAT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE
LOCATION '/home/biadmin/hivetbl';

2. Load data
HIVE>LOAD DATA LOCAL INPATH '/home/biadmin/hivetbl/student_details.txt'
OVERWRITE INTO TABLE student_details;

HIVE>LOAD DATA LOCAL INPATH '/home/biadmin/hivetbl/student_score.txt'
OVERWRITE INTO TABLE student_score;

3. Run inner join
Hive> SELECT a.studentid,a.studentname,a.age,b.classid,b.score,c.classname
FROM student_details a JOIN student_score b ON (a.studentid = b.studentid);

Result:
There are the following exception:
cannot find field studentname from [0:studentid, 1:classid, 2:score]

[My Question]: studentname is a field of the table student_details (The
first table), why search it in the table student_score(the second table)?

log is like that;
... ...
2012-01-15 23:24:41,727 INFO org.apache.hadoop.mapred.TaskInProgress: Error
from attempt_201201152221_0014_m_000000_3: java.lang.RuntimeException:
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while
processing row {"studentid":106,"classid":null,"score":635.0}
        at
org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime
Error while processing row {"studentid":106,"classid":null,"score":635.0}
        at
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:550)
        at
org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
        ... 4 more
Caused by: java.lang.RuntimeException: cannot find field studentname from
[0:studentid, 1:classid, 2:score]
        at
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:345)
        at
org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector.getStructFieldRef(LazySimpleStructObjectInspector.java:168)
        at
org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:57)
        at
org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:896)
        at
org.apache.hadoop.hive.ql.exec.Operator.initEvaluatorsAndReturnStruct(Operator.java:922)
        at
org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:200)
        at
org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
        at
org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
        at
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
        at
org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
        at
org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
        at
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:531)
        ... 5 more