Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive, mail # user - different outer join plan between hive 0.9 and hive 0.10


Copy link to this message
-
Re: different outer join plan between hive 0.9 and hive 0.10
Navis류승우 2013-03-28, 05:47
It's a bug (https://issues.apache.org/jira/browse/HIVE-4206).

Thanks for reporting it.

2013/3/24 wzc <[EMAIL PROTECTED]>:
> Recently we tried to upgrade our hive from 0.9 to 0.10, but found some of
> our hive queries almost 7 times slow.  One of such query consists multiple
> table outer join on the same key. By looking into the query,  we found the
> query plans generate by hive 0.9 and hive 0.10 are different.  Here is the
> example:
>
> testcase:
>
> use default;
> create table test_join (
>   `key` string,
>   `value` string
> );
>
> explain select
>     sum(a.value) val
> from default.test_join a
> left outer join default.test_join b on a.key = b.key
> left outer join default.test_join c on a.key = c.key
> left outer join default.test_join d on a.key = d.key
> left outer join default.test_join e on a.key = e.key
> left outer join default.test_join f on a.key = f.key
> left outer join default.test_join g on a.key = g.key
>
>
> the explain of hive 0.9:
>
>  STAGE DEPENDENCIES:
>
>     Stage-1 is a root stage
>
>     Stage-2 depends on stages: Stage-1
>
>     Stage-0 is a root stage
>
> ...
>
> Reduce Operator Tree:
>
>           Join Operator
>
>             condition map:
>
>                  Left Outer Join0 to 1
>
>                  Left Outer Join0 to 2
>
>                  Left Outer Join0 to 3
>
>                  Left Outer Join0 to 4
>
>                  Left Outer Join0 to 5
>
>                  Left Outer Join0 to 6
>
>             condition expressions:
>
>               0 {VALUE._col1}
>
>               1
>
>               2
>
>               3
>
>               4
>
>               5
>
>               6
>
> ......
>
>
> while the explain of hive 0.10:
>
>   STAGE DEPENDENCIES:
>
>     Stage-6 is a root stage
>
>     Stage-1 depends on stages: Stage-6
>
>     Stage-2 depends on stages: Stage-1
>
>     Stage-0 is a root stage
>
> ...
>
>      Reduce Operator Tree:
>
>           Join Operator
>
>             condition map:
>
>                  Left Outer Join0 to 1
>
>                  Left Outer Join0 to 2
>
>             condition expressions:
>
>               0 {VALUE._col0} {VALUE._col1}
>
>               1
>
>               2
>
> ...
>
>      Reduce Operator Tree:
>
>           Join Operator
>
>             condition map:
>
>                  Left Outer Join0 to 1
>
>                  Left Outer Join0 to 2
>
>                  Left Outer Join0 to 3
>
>                  Left Outer Join0 to 4
>
>             condition expressions:
>
>               0 {VALUE._col9}
>
>               1
>
>               2
>
>               3
>
>               4
>
> ....
>
>
> It seems like hive 0.9 use only one stage/job to process all outer joins but
> hive 0.10 split them into two stage.  When running such kind of query on
> hive0.10 in production,  in the second stage of outer join process, some
> reducer stucks.
>
> I can't find any param to change the query plain ,  can anyone give me some
> hint?
>
> Thanks!
>
>
>
>