Re: single MR stage for join and group by
Which version of Hive are you running your test on? I believe, though I am
not certain, that Hive 0.11 includes the optimization you are looking for.
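
One way to check is to look at the plan Hive produces for the query. The
snippet below is only a sketch under a few assumptions: it joins on
A.id = B.id (which I assume is what the quoted query intends), and it assumes
your build recognizes the hive.optimize.correlation property. That property
exists in later Hive releases, but I have not confirmed that it is available
in 0.11.

```sql
-- Sketch only: table and column names follow the quoted message.
-- Assumption: this Hive build recognizes hive.optimize.correlation.
SET hive.optimize.correlation=true;

-- EXPLAIN prints the plan without running the query.
EXPLAIN
SELECT max(A.value), A.id
FROM A JOIN B ON A.id = B.id
GROUP BY A.id;
```

The STAGE DEPENDENCIES section of the EXPLAIN output lists the stages in the
plan, so comparing it with the property set to false and then to true should
show whether the join and the group by are merged into a single MapReduce job
on your version.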
On Thu, Aug 1, 2013 at 10:19 AM, Chen Song <[EMAIL PROTECTED]> wrote:

> Suppose we have 2 simple tables
>
> A
> id int
> value string
>
> B
> id
>
> When hive translates the following query
>
> select max(A.value), A.id from A join B on A.id = B.id group by A.id;
>
> It launches 2 stages, one for the join and one for the group by.
>
> My understanding is that if the join key set is a subset of the group by
> key set, both can be done in the same MapReduce job. If that is correct
> in theory, could it be a feature in Hive?
>
> Chen
>
>