Re: single MR stage for join and group by
Stephen Sprague 2013-08-02, 00:32
And which version of Hive are you running your test on? I believe, though I am
not certain, that Hive 0.11 includes the optimization you seek.
On Thu, Aug 1, 2013 at 10:19 AM, Chen Song <[EMAIL PROTECTED]> wrote:
> Suppose we have 2 simple tables, A and B, each with the columns
> id int
> value string
> When Hive translates the following query
> select max(A.value), A.id from A join B on A.id = B.id group by A.id;
> It launches 2 stages, one for the join and one for the group by.
> My understanding is that if the join key set is a subset of the group by
> key set, both can be done in the same MapReduce job. If that is correct
> in theory, could it be a feature in Hive?
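
To illustrate the idea in the quoted question, here is a minimal Python sketch of why one shuffle can be enough when the join key and the group-by key are both id. It is only a simulation under stated assumptions, not Hive's actual implementation: the function name join_then_group_single_pass and the sample rows are made up for illustration, and tables are modeled as lists of (id, value) tuples.

```python
from collections import defaultdict


def join_then_group_single_pass(rows_a, rows_b):
    """Simulate one shuffle on id, doing the join and max() in the same reduce step.

    Hypothetical helper for illustration only; rows are (id, value) tuples.
    """
    # "Map" phase: tag each row by source table and partition on the shared key (id).
    shuffled = defaultdict(lambda: {"A": [], "B": 0})
    for row_id, value in rows_a:
        shuffled[row_id]["A"].append(value)
    for row_id, _value in rows_b:
        shuffled[row_id]["B"] += 1

    # "Reduce" phase: every row for a given id is already in one place, so the
    # inner join and the aggregate can be computed together without a second shuffle.
    result = {}
    for row_id, group in shuffled.items():
        if group["A"] and group["B"] > 0:  # inner join: id must appear in both tables
            # max() is unaffected by the row duplication a join would produce.
            result[row_id] = max(group["A"])
    return result


if __name__ == "__main__":
    table_a = [(1, "x"), (1, "z"), (2, "y"), (3, "q")]
    table_b = [(1, "b"), (2, "c"), (2, "d"), (4, "e")]
    print(join_then_group_single_pass(table_a, table_b))
```

Note that this shortcut works directly for max(); an aggregate such as count or sum would also have to account for how many B rows match each id.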