Hive >> mail # user >> hive 0.11 auto convert join bug report


Re: Re: hive 0.11 auto convert join bug report
Hi,

Hive is notorious for producing different results with different aliases.
Changing an alias has been a last resort for working around bugs in desperate situations.

I think the patch attached to the issue is ready; I hope it helps.

Thanks.
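
A possible stopgap until the HIVE-5056 patch is applied, offered as an untested sketch: if the failure really comes from the automatic map-join conversion, as the original report quoted below suspects, turning that conversion off for the session should make Hive plan a regular reduce-side join instead. `hive.auto.convert.join` is the property named in the report; whether disabling it avoids this particular exception is an assumption and is not confirmed anywhere in this thread.

```sql
-- Assumption: the bug is only triggered when Hive auto-converts the joins to map joins.
set hive.auto.convert.join=false;
-- Then re-run the failing SELECT from the test case unchanged.
```

The trade-off is that small-table joins lose the map-side optimization, so this setting is probably only worth keeping until a fixed build is available.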

2013/8/11  <[EMAIL PROTECTED]>:
> Hi Navis,
>
> My colleague chenchun found that the hash codes of 'deal' and 'dim_pay_date' are
> the same, and that the code in MapJoinProcessor.java ignores the order of
> the rowschema.
> I looked at your patch and it touches exactly the place we were working on.
> Thanks for your patch.
>
> On Sunday, August 11, 2013, at 9:38 PM, Navis류승우 wrote:
>
> Hi,
>
> I've filed this as https://issues.apache.org/jira/browse/HIVE-5056
> and attached a patch for it.
>
> It needs a full test run for confirmation, but you can try it.
>
> Thanks.
>
> 2013/8/11 <[EMAIL PROTECTED]>:
>
> Hi all,
> When I change the table alias dim_pay_date to A, the query passes in Hive
> 0.11 (https://gist.github.com/code6/6187569#file-hive11_auto_convert_join_change_alias_pass):
>
> use test;
> create table if not exists src ( `key` int,`val` string);
> load data local inpath '/Users/code6/git/hive/data/files/kv1.txt' overwrite
> into table src;
> drop table if exists orderpayment_small;
> create table orderpayment_small (`dealid` int,`date` string,`time` string,
> `cityid` int, `userid` int);
> insert overwrite table orderpayment_small select 748, '2011-03-24',
> '2011-03-24', 55 ,5372613 from src limit 1;
> drop table if exists user_small;
> create table user_small( userid int);
> insert overwrite table user_small select key from src limit 100;
> set hive.auto.convert.join.noconditionaltask.size = 200;
> SELECT
> `A`.`date`
> , `deal`.`dealid`
> FROM `orderpayment_small` `orderpayment`
> JOIN `orderpayment_small` `A` ON `A`.`date` = `orderpayment`.`date`
> JOIN `orderpayment_small` `deal` ON `deal`.`dealid` = `orderpayment`.`dealid`
> JOIN `orderpayment_small` `order_city` ON `order_city`.`cityid` = `orderpayment`.`cityid`
> JOIN `user_small` `user` ON `user`.`userid` = `orderpayment`.`userid`
> limit 5;
>
>
> It's quite strange and interesting now. I will keep searching for the answer
> to this issue.
>
>
>
> On Friday, August 9, 2013, at 3:32 AM, [EMAIL PROTECTED] wrote:
>
> Hi all,
> I'm currently testing Hive 0.11 and have encountered a bug with
> hive.auto.convert.join. I constructed a test case so that everyone can reproduce
> it (you can also find the test case
> here: https://gist.github.com/code6/6187569#file-hive11_auto_convert_join_bug):
>
> use test;
> create table src ( `key` int,`val` string);
> load data local inpath '/Users/code6/git/hive/data/files/kv1.txt' overwrite
> into table src;
> drop table if exists orderpayment_small;
> create table orderpayment_small (`dealid` int,`date` string,`time` string,
> `cityid` int, `userid` int);
> insert overwrite table orderpayment_small select 748, '2011-03-24',
> '2011-03-24', 55 ,5372613 from src limit 1;
> drop table if exists user_small;
> create table user_small( userid int);
> insert overwrite table user_small select key from src limit 100;
> set hive.auto.convert.join.noconditionaltask.size = 200;
> SELECT
> `dim_pay_date`.`date`
> , `deal`.`dealid`
> FROM `orderpayment_small` `orderpayment`
> JOIN `orderpayment_small` `dim_pay_date` ON `dim_pay_date`.`date` = `orderpayment`.`date`
> JOIN `orderpayment_small` `deal` ON `deal`.`dealid` = `orderpayment`.`dealid`
> JOIN `orderpayment_small` `order_city` ON `order_city`.`cityid` = `orderpayment`.`cityid`
> JOIN `user_small` `user` ON `user`.`userid` = `orderpayment`.`userid`
> limit 5;
>
>
> Replace the path to kv1.txt with your own. Running the above
> query in Hive 0.11 fails with an ArrayIndexOutOfBoundsException. You
> can see the explain result and the console output of the query here:
> https://gist.github.com/code6/6187569
>
> I compiled the trunk code, but this query fails there as well. The query does run
> in Hive 0.9 with hive.auto.convert.join turned on.
>
> I tried to dig into this problem and I think it may be caused by the map join
> optimization. Some adjacent operators don't match for the input/output