

Peter Marron 2013-06-10, 08:57
Re: Use of virtual columns in joins
You might be hitting HIVE-4033
(https://issues.apache.org/jira/browse/HIVE-4033), in which case it's
recommended that you upgrade to 0.11, where this bug is fixed.
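
If upgrading is not immediately possible, one thing that may be worth trying is to keep the virtual column out of the join query altogether. The sketch below is an untested assumption, not something confirmed by the JIRA or verified against 0.10.0: it copies BLOCK__OFFSET__INSIDE__FILE into an ordinary column with a single-table query and then joins on that column. The table name a_with_offset and the column name file_offset are hypothetical; the other names come from the tables described in the quoted message.

```sql
-- Hypothetical workaround sketch: a_with_offset and file_offset are
-- illustrative names, not part of the original tables.

-- Step 1: materialize the virtual column as a regular column using a
-- single-table query (no join involved).
CREATE TABLE a_with_offset AS
SELECT english,
       number,
       italian,
       BLOCK__OFFSET__INSIDE__FILE AS file_offset
FROM a;

-- Step 2: join on the regular column, so the join query itself
-- references no virtual columns.
SELECT a2.english, a2.number, a2.italian, a2.file_offset,
       b.rownumber, b.offset
FROM a_with_offset a2
JOIN b ON (b.offset = a2.file_offset);
```

The second query has the same shape as the plain equality join that is reported to work in the quoted message, which is the reason for splitting the work into two steps.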
On Mon, Jun 10, 2013 at 1:57 AM, Peter Marron <
[EMAIL PROTECTED]> wrote:

> Hi,
>
> I’m using hive 0.10.0 over hadoop 1.0.4.
>
> I have created a couple of test tables and found that various join queries
> that refer to virtual columns fail. For example the query:
>
> SELECT * FROM a JOIN b ON b.rownumber = a.number;
>
> works but the following three queries all fail.
>
> SELECT *,a.BLOCK__OFFSET__INSIDE__FILE FROM a JOIN b ON b.rownumber > a.number;
> SELECT *,b.BLOCK__OFFSET__INSIDE__FILE FROM a JOIN b ON b.rownumber > a.number;
> SELECT * FROM a JOIN b ON b.offset = a.BLOCK__OFFSET__INSIDE__FILE;
>
> They all fail in the same way, but I am too much of a newb to be able to
> tell much from the error message:
>
> Error during job, obtaining debugging information...
> Execution failed with exit status: 2
> Obtaining error information
>
> Task failed!
> Task ID:
>   Stage-1
>
> Logs:
>
> /tmp/pmarron/hive.log
>
> When I look in the log I can find this:
>
> 2013-06-07 14:06:22,831 WARN  mapred.LocalJobRunner (LocalJobRunner.java:run(298)) - job_local_0001
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable 1,0
>         at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable 1,0
>         at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:539)
>         at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
>         ... 4 more
> Caused by: java.lang.NullPointerException
>         at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:516)
>         ... 5 more
>
> and I’ve looked at the source code referred to, but it doesn’t mean much
> to me, I’m afraid.
>
> For completeness here’s a description of the tables:
>
> hive> select * from a;
> OK
> first      1     primo
> second     2     secondo
> third      3     terzo
> fourth     4     quarto
> fifth      5     quinto
> sitxh      6     sesto
> seventh    7     settimo
> eigthth    8     ottavo
> ninth      9     nono
> tenth      10    decimo
> Time taken: 0.105 seconds
> hive> describe extended a;
> OK
> english    string
> number     bigint
> italian    string
>
> hive> select * from b;
> OK
> 1     0
> 2     14
> 3     31
> 4     45
> 5     61
> 6     77
> 7     91
> 8     109
> 9     126
> 10    139
> Time taken: 0.067 seconds
> hive> describe  b;
> OK
> rownumber    bigint
> offset       bigint
> Time taken: 0.072 seconds
> hive>
>
> These queries aren’t actually important to me, as I am taking a different
> approach.
>
> But I thought that it might be important to mention these failures if they
Peter Marron 2013-06-25, 09:56
Navis류승우 2013-06-26, 07:10