Hive >> mail # user >> single MR stage for join and group by


Chen Song 2013-08-01, 17:19
Re: single MR stage for join and group by
Which version of Hive are you running your test on? I believe, though I am
not certain, that Hive 0.11 includes the optimization you seek.
On Thu, Aug 1, 2013 at 10:19 AM, Chen Song <[EMAIL PROTECTED]> wrote:

> Suppose we have 2 simple tables
>
> A
> id int
> value string
>
> B
> id
>
> When Hive translates the following query
>
> select max(A.value), A.id from A join B on A.id = B.id group by A.id;
>
> It launches 2 stages, one for the join and one for the group by.
>
> My understanding is that if the join key set is a subset of the group by
> key set, both can be done in the same MapReduce job. If that is correct
> in theory, could it be a feature in Hive?
>
> Chen
>
>
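To illustrate the reasoning in the quoted question, here is a minimal Python sketch of why one shuffle can be enough when the join key is also the group-by key. It is a simulation under stated assumptions, not how Hive implements anything: the function name single_stage_join_group and the sample rows are hypothetical, table B is reduced to a list of ids, and A.value is assumed to be non-null.

```python
from collections import defaultdict


def single_stage_join_group(a_rows, b_ids):
    """Simulate one map/shuffle/reduce pass for:
    select max(A.value), A.id from A join B on A.id = B.id group by A.id
    """
    # "Map" + "shuffle": route every row to the bucket for its id, tagged by table.
    # Rows with a NULL (None) id are dropped because NULL never matches in a join.
    buckets = defaultdict(lambda: {"a_values": [], "b_count": 0})
    for row_id, value in a_rows:
        if row_id is not None:
            buckets[row_id]["a_values"].append(value)
    for row_id in b_ids:
        if row_id is not None:
            buckets[row_id]["b_count"] += 1

    # "Reduce": each bucket already holds every A row and B row for one id,
    # so the inner join and the aggregate can both be computed right here.
    result = {}
    for row_id, bucket in buckets.items():
        if bucket["a_values"] and bucket["b_count"] > 0:
            # Duplicate matches in B repeat A's values but cannot change the max.
            result[row_id] = max(bucket["a_values"])
    return result


# Hypothetical sample data, for illustration only.
a_rows = [(1, "apple"), (1, "pear"), (2, "fig"), (3, "kiwi")]
b_ids = [1, 1, 3, 4]
print(single_stage_join_group(a_rows, b_ids))
```

Because both operators partition on id, the second shuffle adds nothing in this sketch. Whether a given Hive version actually merges the two stages can be checked by running EXPLAIN on the query and counting the map-reduce stages in the plan.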
Yin Huai 2013-08-02, 04:14
Chen Song 2013-08-02, 17:32