Hive, mail # user - single MR stage for join and group by


Chen Song 2013-08-01, 17:19
Re: single MR stage for join and group by
Stephen Sprague 2013-08-02, 00:32
And what version of Hive are you running your test on? I believe, though I am
not certain, that Hive 0.11 includes the optimization you seek.
On Thu, Aug 1, 2013 at 10:19 AM, Chen Song <[EMAIL PROTECTED]> wrote:

> Suppose we have 2 simple tables
>
> A
> id int
> value string
>
> B
> id
>
> When Hive translates the following query
>
> select max(A.value), A.id from A join B on A.id = B.id group by A.id;
>
> it launches 2 stages, one for the join and one for the group by.
>
> My understanding is that if the join key set is a subset of the group-by
> key set, both can be done in the same MapReduce job. If that is correct
> in theory, could it be a feature in Hive?
>
> Chen
>
>
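To illustrate the idea in the quoted question, here is a minimal Python sketch of how a join and a group by on the same key could share one shuffle. It is a plain in-memory simulation, not Hive's implementation; the function name `single_stage_join_group` and the sample rows are hypothetical, and it assumes the join condition is A.id = B.id.

```python
from collections import defaultdict


def single_stage_join_group(a_rows, b_rows):
    """Simulate: select max(A.value), A.id from A join B on A.id = B.id group by A.id.

    a_rows: iterable of (id, value) tuples for table A.
    b_rows: iterable of ids for table B.
    Returns a dict mapping id -> max(value).
    """
    # "Map" phase: tag every record with its source table and partition on id.
    # Because the join key and the group-by key are both id, one shuffle is enough.
    shuffled = defaultdict(lambda: {"A": [], "B": 0})
    for row_id, value in a_rows:
        shuffled[row_id]["A"].append(value)
    for row_id in b_rows:
        shuffled[row_id]["B"] += 1

    # "Reduce" phase: each key's group holds all rows from both tables, so the
    # inner join and the max() aggregation can be evaluated together.
    result = {}
    for row_id, group in shuffled.items():
        if group["A"] and group["B"]:  # inner join: id must appear in both tables
            result[row_id] = max(group["A"])
    return result


a = [(1, "x"), (1, "z"), (2, "y"), (3, "q")]
b = [1, 1, 3, 4]
print(single_stage_join_group(a, b))
```

Duplicate ids in B do not change the result here because max() is insensitive to repeated rows; an aggregate such as count or sum would need to account for the join multiplicity.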
Yin Huai 2013-08-02, 04:14
Chen Song 2013-08-02, 17:32