Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
Hive >> mail # user >> different outer join plan between hive 0.9 and hive 0.10


+
wzc 2013-03-24, 05:03
+
Navis류승우 2013-03-28, 05:47
Copy link to this message
-
Re: different outer join plan between hive 0.9 and hive 0.10
The bug remains even if I apply the patch in HIVE-4206 :(  The explain
result hasn't change.
2013/3/28 Navis류승우 <[EMAIL PROTECTED]>

> It's a bug (https://issues.apache.org/jira/browse/HIVE-4206).
>
> Thanks for reporting it.
>
> 2013/3/24 wzc <[EMAIL PROTECTED]>:
> > Recently we tried to upgrade our hive from 0.9 to 0.10, but found some of
> > our hive queries almost 7 times slow.  One of such query consists
> multiple
> > table outer join on the same key. By looking into the query,  we found
> the
> > query plans generate by hive 0.9 and hive 0.10 are different.  Here is
> the
> > example:
> >
> > testcase:
> >
> > use default;
> > create table test_join (
> >   `key` string,
> >   `value` string
> > );
> >
> > explain select
> >     sum(a.value) val
> > from default.test_join a
> > left outer join default.test_join b on a.key = b.key
> > left outer join default.test_join c on a.key = c.key
> > left outer join default.test_join d on a.key = d.key
> > left outer join default.test_join e on a.key = e.key
> > left outer join default.test_join f on a.key = f.key
> > left outer join default.test_join g on a.key = g.key
> >
> >
> > the explain of hive 0.9:
> >
> >  STAGE DEPENDENCIES:
> >
> >     Stage-1 is a root stage
> >
> >     Stage-2 depends on stages: Stage-1
> >
> >     Stage-0 is a root stage
> >
> > ...
> >
> > Reduce Operator Tree:
> >
> >           Join Operator
> >
> >             condition map:
> >
> >                  Left Outer Join0 to 1
> >
> >                  Left Outer Join0 to 2
> >
> >                  Left Outer Join0 to 3
> >
> >                  Left Outer Join0 to 4
> >
> >                  Left Outer Join0 to 5
> >
> >                  Left Outer Join0 to 6
> >
> >             condition expressions:
> >
> >               0 {VALUE._col1}
> >
> >               1
> >
> >               2
> >
> >               3
> >
> >               4
> >
> >               5
> >
> >               6
> >
> > ......
> >
> >
> > while the explain of hive 0.10:
> >
> >   STAGE DEPENDENCIES:
> >
> >     Stage-6 is a root stage
> >
> >     Stage-1 depends on stages: Stage-6
> >
> >     Stage-2 depends on stages: Stage-1
> >
> >     Stage-0 is a root stage
> >
> > ...
> >
> >      Reduce Operator Tree:
> >
> >           Join Operator
> >
> >             condition map:
> >
> >                  Left Outer Join0 to 1
> >
> >                  Left Outer Join0 to 2
> >
> >             condition expressions:
> >
> >               0 {VALUE._col0} {VALUE._col1}
> >
> >               1
> >
> >               2
> >
> > ...
> >
> >      Reduce Operator Tree:
> >
> >           Join Operator
> >
> >             condition map:
> >
> >                  Left Outer Join0 to 1
> >
> >                  Left Outer Join0 to 2
> >
> >                  Left Outer Join0 to 3
> >
> >                  Left Outer Join0 to 4
> >
> >             condition expressions:
> >
> >               0 {VALUE._col9}
> >
> >               1
> >
> >               2
> >
> >               3
> >
> >               4
> >
> > ....
> >
> >
> > It seems like hive 0.9 use only one stage/job to process all outer joins
> but
> > hive 0.10 split them into two stage.  When running such kind of query on
> > hive0.10 in production,  in the second stage of outer join process, some
> > reducer stucks.
> >
> > I can't find any param to change the query plain ,  can anyone give me
> some
> > hint?
> >
> > Thanks!
> >
> >
> >
> >
>
+
Navis류승우 2013-03-29, 01:34
+
wzc1989 2013-05-13, 16:11
+
Navis류승우 2013-05-14, 04:26
+
wzc1989 2013-05-14, 05:06
+
Navis류승우 2013-05-14, 06:17
+
wzc1989 2013-07-01, 07:15
+
Navis류승우 2013-07-02, 00:24
+
wzc1989 2013-07-03, 15:27