Hive >> mail # user >> Use of virtual columns in joins


Use of virtual columns in joins
Hi,

I'm using hive 0.10.0 over hadoop 1.0.4.

I have created a couple of test tables and found that various join queries
that refer to virtual columns fail. For example, the query:

SELECT * FROM a JOIN b ON b.rownumber = a.number;

works, but the following three queries all fail:

SELECT *,a.BLOCK__OFFSET__INSIDE__FILE FROM a JOIN b ON b.rownumber = a.number;
SELECT *,b.BLOCK__OFFSET__INSIDE__FILE FROM a JOIN b ON b.rownumber = a.number;
SELECT * FROM a JOIN b ON b.offset = a.BLOCK__OFFSET__INSIDE__FILE;

They all fail in the same way, but I am too much of a newb to be able to
tell much from the error message:

Error during job, obtaining debugging information...
Execution failed with exit status: 2
Obtaining error information

Task failed!
Task ID:
  Stage-1

Logs:

/tmp/pmarron/hive.log

When I look in the log I can find this:

2013-06-07 14:06:22,831 WARN  mapred.LocalJobRunner (LocalJobRunner.java:run(298)) - job_local_0001
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable 1,0
        at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable 1,0
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:539)
        at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
        ... 4 more
Caused by: java.lang.NullPointerException
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:516)
        ... 5 more

I've looked at the source code referred to, but I'm afraid it doesn't mean much to me.
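
Two further queries might help narrow this down. This is only a sketch: neither was part of the tests above, the alias names a2 and file_offset are made up for illustration, and it is an assumption (not something confirmed here) that wrapping the virtual column in a subquery behaves any differently on hive 0.10.0.

```sql
-- Check 1: does the virtual column resolve on its own, with no join involved?
SELECT number, BLOCK__OFFSET__INSIDE__FILE FROM a;

-- Check 2 (untested idea): expose the virtual column under an ordinary
-- alias in a subquery, then join on that alias instead of on the
-- virtual column directly. a2 and file_offset are hypothetical names.
SELECT a2.english, a2.number, a2.italian, b.rownumber
FROM (
  SELECT english,
         number,
         italian,
         BLOCK__OFFSET__INSIDE__FILE AS file_offset
  FROM a
) a2
JOIN b ON b.offset = a2.file_offset;
```

If the first query succeeds and only the join variants fail, that would point at how the join plan handles virtual columns rather than at the tables themselves.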

For completeness, here's a description of the tables:

hive> select * from a;
OK
first    1   primo
second   2   secondo
third    3   terzo
fourth   4   quarto
fifth    5   quinto
sitxh    6   sesto
seventh  7   settimo
eigthth  8   ottavo
ninth    9   nono
tenth    10  decimo
Time taken: 0.105 seconds
hive> describe extended a;
OK
english  string
number   bigint
italian  string

hive> select * from b;
OK
1   0
2   14
3   31
4   45
5   61
6   77
7   91
8   109
9   126
10  139
Time taken: 0.067 seconds
hive> describe  b;
OK
rownumber  bigint
offset     bigint
Time taken: 0.072 seconds
hive>
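
For anyone wanting to reproduce this, the tables could be recreated roughly as follows. This is a sketch reconstructed from the describe output above, not the original DDL: the tab delimiter, the TEXTFILE storage format and the file paths are all assumptions.

```sql
-- Column names and types are taken from the describe output above.
-- The row format and storage format are assumptions.
CREATE TABLE a (
  english STRING,
  number  BIGINT,
  italian STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

CREATE TABLE b (
  rownumber BIGINT,
  offset    BIGINT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

-- Hypothetical paths: point these at tab-separated files holding the
-- rows shown in the select output above.
LOAD DATA LOCAL INPATH '/path/to/a.tsv' INTO TABLE a;
LOAD DATA LOCAL INPATH '/path/to/b.tsv' INTO TABLE b;
```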

These queries aren't actually important to me, as I am taking a different approach.
But I thought it might be worth mentioning these failures in case they expose
a bug. Or maybe I'll learn that I'm doing something wrong and there's a way to get these
joins to work...

Regards,

Z