Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive, mail # dev - mapjoin on subquery result


Copy link to this message
-
mapjoin on subquery result
Sukhendu Chakraborty 2013-12-04, 17:05
Hi,

Is there anyway mapjoin works on the subquery(not the underlying table). I
have the following query:

select external_id,count(category_id) from
catalog_products_in_categories_orc pc inner join (select * from
catalog_products_orc where s_id=118) p on pc.product_id=p.id   group by
external_id;
Now, even though catalog_products_orc is a big table, after filtering
(s_id=118) it results in very few number of rows which can be easily
optimized to a mapjoin (with catalog_products_in_categories_orc as the big
table and the subquery result as the small table) . However, when I try to
specify /*+MAPJOIN(p)*/ to enforce this, it results in a mapjoin for the
table catalog_products_orc (and not on the subquery after filtering).

Any ideas to achieve mapjoin on a subquery (and not the underlying table)?
-Sukhendu