Hive >> mail # user >> A bug of auto convert join with intermediate table?

Re: A bug of auto convert join with intermediate table?
Hi Zhong,

Is it possible that you are hitting the following Hive bug? You may want to upgrade your current Hive client.
Hortonworks, Inc.
Technical Support Engineer
Abdelrahman Shettia
Office phone: (708) 689-9609
How am I doing?   Please feel free to provide feedback to my manager Rick Morris at [EMAIL PROTECTED]
On Feb 6, 2013, at 5:28 AM, Zhong Wang <[EMAIL PROTECTED]> wrote:

> Hi all,
> I am running tests on Hive's auto convert join. From the source code, it seems the conditional task considers the size of the intermediate table and runs the local task that generates the hash table on it if it is smaller than the threshold set by hive.mapjoin.smalltable.filesize. However, I ran a very simple query based on TPC-H:
> set hive.auto.convert.join=true;
> insert overwrite table q3_tmp
> select c_custkey, o_orderkey, o_orderdate
> from orders o join customer c on c.c_mktsegment = 'BUILDING' and
> c.c_custkey = o.o_custkey
> join lineitem l on l.l_orderkey = o.o_orderkey
> where c.c_custkey < 1000;
> The intermediate table of c join o is very small (50KB), far below the threshold. However, both map joins between the intermediate table and lineitem are filtered out by the conditional task. Is this a bug in auto convert join, or is something wrong with my usage or analysis?
> Zhong
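
One way to see which plan Hive actually builds for the query above is to set the relevant properties explicitly and inspect the output of EXPLAIN. The following is a minimal sketch, assuming the same TPC-H tables as in the original message. The value 25000000 bytes is the documented default of hive.mapjoin.smalltable.filesize and is restated only to make the threshold visible. The exact stage names in the plan may differ between Hive versions.

```sql
-- Let Hive try to convert common joins into map joins.
set hive.auto.convert.join=true;
-- Size threshold in bytes below which a table qualifies as the small side
-- (25000000 is the documented default, shown here for visibility).
set hive.mapjoin.smalltable.filesize=25000000;

-- Print the plan without running the query. Conditional stages and their
-- map-join and backup common-join tasks should appear here if the
-- conversion was considered for a given join.
explain
select c_custkey, o_orderkey, o_orderdate
from orders o join customer c on c.c_mktsegment = 'BUILDING' and
c.c_custkey = o.o_custkey
join lineitem l on l.l_orderkey = o.o_orderkey
where c.c_custkey < 1000;
```

Comparing this plan with the tasks that actually run can help show whether the second join falls back to a common join at plan time or only when the conditional task is resolved at run time.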