Hive >> mail # user >> Use of virtual columns in joins


Use of virtual columns in joins
Hi,

I'm using hive 0.10.0 over hadoop 1.0.4.

I have created a couple of test tables and found that various join queries
that refer to virtual columns fail. For example, the query:

SELECT * FROM a JOIN b ON b.rownumber = a.number;

works, but the following three queries all fail:

SELECT *,a.BLOCK__OFFSET__INSIDE__FILE FROM a JOIN b ON b.rownumber = a.number;
SELECT *,b.BLOCK__OFFSET__INSIDE__FILE FROM a JOIN b ON b.rownumber = a.number;
SELECT * FROM a JOIN b ON b.offset = a.BLOCK__OFFSET__INSIDE__FILE;

They all fail in the same way, but I am too much of a newb to be able to
tell much from the error message:

Error during job, obtaining debugging information...
Execution failed with exit status: 2
Obtaining error information

Task failed!
Task ID:
  Stage-1

Logs:

/tmp/pmarron/hive.log

When I look in the log I can find this:

2013-06-07 14:06:22,831 WARN  mapred.LocalJobRunner (LocalJobRunner.java:run(298)) - job_local_0001
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable 1,0
        at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable 1,0
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:539)
        at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
        ... 4 more
Caused by: java.lang.NullPointerException
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:516)
        ... 5 more

I've looked at the source code referred to, but I'm afraid it doesn't mean much to me.

For completeness here's a description of the tables:

hive> select * from a;
OK
first      1     primo
second     2     secondo
third      3     terzo
fourth     4     quarto
fifth      5     quinto
sitxh      6     sesto
seventh    7     settimo
eigthth    8     ottavo
ninth      9     nono
tenth      10    decimo
Time taken: 0.105 seconds
hive> describe extended a;
OK
english    string
number     bigint
italian    string

hive> select * from b;
OK
1     0
2     14
3     31
4     45
5     61
6     77
7     91
8     109
9     126
10    139
Time taken: 0.067 seconds
hive> describe b;
OK
rownumber    bigint
offset       bigint
Time taken: 0.072 seconds
hive>
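
To make the report easier to reproduce, here is a minimal sketch of how tables like these could be set up. The column names and types come from the describe output above. The tab delimiter, the TEXTFILE storage and the two file paths are assumptions for illustration only, since the original DDL is not shown in this message.

```sql
-- Column names and types as reported by "describe" above.
-- ASSUMPTION: tab-delimited text storage; the original DDL is not shown.
CREATE TABLE a (english STRING, number BIGINT, italian STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

CREATE TABLE b (rownumber BIGINT, offset BIGINT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

-- HYPOTHETICAL paths: local files holding the rows listed above.
LOAD DATA LOCAL INPATH '/tmp/a.tsv' OVERWRITE INTO TABLE a;
LOAD DATA LOCAL INPATH '/tmp/b.tsv' OVERWRITE INTO TABLE b;

-- Check the virtual column on its own, without a join.
SELECT number, BLOCK__OFFSET__INSIDE__FILE FROM a;

-- One of the failing join queries from above.
SELECT * FROM a JOIN b ON b.offset = a.BLOCK__OFFSET__INSIDE__FILE;
```

Running the single-table SELECT before the join may help show whether the failure is specific to joins or affects any use of the virtual column.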

These queries aren't actually important to me, as I am taking a different approach.
But I thought that it might be important to mention these failures if they expose
a bug. Or maybe I'll learn that I'm doing something wrong and there's a way to get these
joins to work...

Regards,

Z
