Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Plain View
Hive >> mail # user >> HIVE left semi join multiple tables with columns from multiple table


+
Souvik Banerjee 2012-12-27, 23:16
+
Mark Grover 2012-12-28, 03:06
+
Souvik Banerjee 2012-12-28, 05:26
Copy link to this message
-
Re: HIVE left semi join multiple tables with columns from multiple table
Souvik,
In your new example, you need a RIGHT OUTER JOIN between table1 and table2
(order matters - table1 on left, table2 on right) on the ID1 column.
Something like this (untested by me):
SELECT
   table1.*,
   table2.*
FROM
   table1
   RIGHT OUTER JOIN table2
   ON (table1.id=table2.id_1);

Mark

On Thu, Dec 27, 2012 at 9:26 PM, Souvik Banerjee
<[EMAIL PROTECTED]>wrote:

> Thanks a lot Mark for your attention.
> But I think I cannot go for INNER join, the reason behind the fact being
> that I want all rows of Table3 irrespective of there is any row
> corresponding to that ID in Table 1 or Table 2.
> Probably I would have taken care of that while providing the example.
> The more refined example would be like below.
>
> Look forward for your help.
>
> Thanks and regards,
> Souvik.
>
> P.S. Dropped one table from the earlier example and Now we have table 1
> and table 2, we are looking for table 4. (Hope so it's more simple and it's
> exactly we need)
>
>
>
> *Table 1*
>
> ID
>
> Value
>
> 1
>
> V11
>
> 2
>
> V12
>
> 3
>
> V13
>
>
>
> *Table 2*
>
> ID_1
>
> ID_2
>
> Other_Column1
>
> 1
>
> 3
>
> C1
>
> 2
>
> 1
>
> C2
>
> 3
>
> 2
>
> C3
>
> 4
>
> 9
>
> C4
>
> 12
>
> 3
>
> C5
>
> 1
>
> 8
>
> C6
>
>
>
> I want to formulate a table which would look like (*Table 4*)
>
> ID_1
>
> ID_1_Value
>
> ID_2
>
> ID_2_value
>
> Other_Column1
>
> 1
>
> V11
>
> 3
>
> V13
>
> C1
>
> 2
>
> V12
>
> 1
>
> V11
>
> C2
>
> 3
>
> V13
>
> 2
>
> V12
>
> C3
>
> 4
>
> NULL / Empty
>
> 9
>
> NULL / Empty
>
> C4
>
> 12
>
> NULL / Empty
>
> 3
>
> V13
>
> C5
>
> 1
>
> V11
>
> 8
>
> NULL / Empty
>
> C6
>
>
> On Thu, Dec 27, 2012 at 9:06 PM, Mark Grover <[EMAIL PROTECTED]>wrote:
>
>> Souvik,
>> Let me begin by saying that simplifying the problem goes a long way in
>> helping us answer your question. You did it really nicely here, so thank
>> you for doing that.
>>
>> Why don't you use INNER JOIN instead of LEFT SEMI JOIN? You can express
>> the same query as INNER JOIN with no restrictions on what can be included
>> in the SELECT clause. LEFT SEMI JOIN just implements an "exists" like query
>> efficiently. If you want columns from the right table in your select list,
>> just do the regular (aka inefficient way) inner join.
>>
>> Of course, you can optimize inner joins as map joins, sorted merge join
>> or sorted merge bucketed joins depending on your use case.
>>
>> Mark
>>
>> On Thu, Dec 27, 2012 at 3:16 PM, Souvik Banerjee <
>> [EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>>
>>> I am struggling with a problem described below.
>>> Any help how to resolve this problem is highly appreciated.
>>>
>>> I have got few tables the structure is over simplified for the sake of
>>> describing the nature of the problem that I am facing.
>>>
>>> *Table 1*
>>> * *
>>>
>>>
>>>
>>> ID
>>>
>>> Value
>>>
>>> 1
>>>
>>> V11
>>>
>>> 2
>>>
>>> V12
>>>
>>> 3
>>>
>>> V13
>>>
>>>
>>>
>>> *Table 2*
>>> * *
>>>
>>>
>>>
>>> ID
>>>
>>> Value
>>>
>>> 1
>>>
>>> V21
>>>
>>> 2
>>>
>>> V22
>>>
>>> 3
>>>
>>> V23
>>>
>>>
>>>
>>> *Table 3*
>>> * *
>>>
>>>
>>>
>>> ID_1
>>>
>>> ID_2
>>>
>>> Other_Column1
>>>
>>> 1
>>>
>>> 3
>>>
>>> C1
>>>
>>> 2
>>>
>>> 1
>>>
>>> C2
>>>
>>> 3
>>>
>>> 2
>>>
>>> C3
>>>
>>>
>>>
>>> I want to formulate a table which would look like
>>>
>>> ID_1
>>>
>>> ID_1_Value
>>>
>>> ID_2
>>>
>>> ID_2_value
>>>
>>> Other_Column1
>>>
>>> 1
>>>
>>> V11
>>>
>>> 3
>>>
>>> V23
>>>
>>> C1
>>>
>>> 2
>>>
>>> V12
>>>
>>> 1
>>>
>>> V21
>>>
>>> C2
>>>
>>> 3
>>>
>>> V13
>>>
>>> 2
>>>
>>> V22
>>>
>>> C3
>>>
>>>
>>>
>>> I am facing problem with this.
>>> I tried to LEFT SEMI JOIN in Hive.
>>>
>>> I tried to do it in two steps (For 3 tables)
>>> First step I wanted to do a LEFT SEMI JOIN with TABLE 1 and TABLE 3.
>>> But the problem is that in LEFT SEMI JOIN you can not have columns from
>>> the right table in the select clause. So after join my new table simply
>>> looks like TABLE 3.
>>>
>>> Can you help me how I can achieve this is HIVE.
>>>
>>> Thanks and regards,
>>> Souvik.
+
Souvik Banerjee 2012-12-29, 22:19
+
Mark Grover 2012-12-30, 02:55