Zhong Wang 2013-02-06, 13:28
Re: A bug of auto convert join with intermediate table?
Abdelrhman Shettia 2013-02-06, 20:40
Hi Zhong,
It is possible that you are hitting the following Hive bug. You may want to upgrade your current Hive client.

https://issues.apache.org/jira/browse/HIVE-2095

Thanks
-Abdelrhman

Hortonworks, Inc.
Technical Support Engineer
Abdelrahman Shettia
[EMAIL PROTECTED]
Office phone: (708) 689-9609

How am I doing? Please feel free to provide feedback to my manager Rick Morris at [EMAIL PROTECTED]

On Feb 6, 2013, at 5:28 AM, Zhong Wang <[EMAIL PROTECTED]> wrote:

> Hi all,
>
> I am running tests on Hive auto convert join. From the source code, it seems the conditional task considers the intermediate table size and runs the local task that generates the hash table on the intermediate table if it is smaller than the threshold hive.mapjoin.smalltable.filesize. However, I ran a very simple query based on TPC-H:
>
> set hive.auto.convert.join=true;
>
> insert overwrite table q3_tmp
> select c_custkey, o_orderkey, o_orderdate
> from orders o join customer c on c.c_mktsegment = 'BUILDING' and
> c.c_custkey = o.o_custkey
> join lineitem l on l.l_orderkey = o.o_orderkey
> where c.c_custkey < 1000;
>
> The intermediate table of c join o is very small (50KB), which is much less than the threshold. However, both the map joins of the intermediate table and lineitem are filtered out by the conditional task. Is this a bug in auto convert join, or is something wrong with my usage/analysis?
>
> Zhong
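
For anyone reproducing the question above, here is a minimal sketch of how the relevant settings and the query plan could be inspected from the Hive CLI. It assumes the TPC-H tables named in the quoted query already exist. The plan layout differs between Hive versions, so no expected output is shown.

```sql
-- Print the current values of the two settings discussed in this thread.
set hive.auto.convert.join;
set hive.mapjoin.smalltable.filesize;

-- Enable automatic conversion of common joins to map joins.
set hive.auto.convert.join=true;

-- Show the plan for the select part of the query without running it.
-- With auto conversion enabled, the plan is expected to list the candidate
-- map-join stages next to the backup common-join stage (an assumption about
-- the output; check it against your Hive version).
explain
select c_custkey, o_orderkey, o_orderdate
from orders o join customer c on c.c_mktsegment = 'BUILDING' and
c.c_custkey = o.o_custkey
join lineitem l on l.l_orderkey = o.o_orderkey
where c.c_custkey < 1000;
```

Comparing the plan before and after a client upgrade may help tell whether the behavior matches the JIRA issue linked above.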
Zhong Wang 2013-02-07, 07:24