Hive >> mail # user >> Map side join


Souvik Banerjee 2012-12-07, 19:58
Re: Map side join
Hi Souvik

In earlier versions of Hive you had to give the map join hint yourself. In later versions, just set hive.auto.convert.join = true;
Hive then automatically selects the smaller table. It is still better to give the smaller table as the first one in the join.

You can use a map join if you are joining a small table with a large one, in terms of data size. By small, I mean the smaller table should preferably be in the range of MBs, because it has to be loaded into memory on the map side.
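As a rough sketch of the two approaches above, applied to the tables A and B from the original question: the column `val` is hypothetical and only stands in for whatever you select from B, and the size threshold shown is an assumption you should check against the defaults of your Hive version.

```sql
-- Option 1: let Hive convert the join to a map join automatically.
SET hive.auto.convert.join = true;
-- Size threshold in bytes below which a table is treated as "small".
-- 25000000 is the commonly documented default; verify it for your version.
SET hive.mapjoin.smalltable.filesize = 25000000;

-- Smaller table (B) listed first, larger table (A) last.
SELECT A.id1, A.id2, A.id3, B.val   -- B.val is a hypothetical column
FROM B JOIN A
  ON (A.id1 = B.id1 AND A.id2 = B.id2 AND A.id3 = B.id3);

-- Option 2: explicit hint, for versions or cases where auto conversion
-- is not used. B is the table to be held in memory.
SELECT /*+ MAPJOIN(B) */ A.id1, A.id2, A.id3, B.val
FROM A JOIN B
  ON (A.id1 = B.id1 AND A.id2 = B.id2 AND A.id3 = B.id3);
```

Running EXPLAIN on either query should show whether a map join operator was actually chosen.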

Regards
Bejoy KS

Sent from remote device, Please excuse typos

-----Original Message-----
From: Souvik Banerjee <[EMAIL PROTECTED]>
Date: Fri, 7 Dec 2012 13:58:25
To: <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
Subject: Map side join

Hello everybody,

I have a question. I didn't come across any post which says something
about this.
I have two tables, let's say A and B.
I want to join A and B in Hive. I am currently using Hive 0.9.
The join would be on a few columns, like ON (A.id1 = B.id1) AND (A.id2 = B.id2) AND (A.id3 = B.id3)

Can I ask Hive to use a map-side join in this scenario? Should I give a hint
to Hive by saying /*+ MAPJOIN(B) */?

Get back to me if you want any more information in this regard.

Thanks and regards,
Souvik.
