Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive >> mail # user >> HIVE left semi join multiple tables with columns from multiple table


Copy link to this message
-
Re: HIVE left semi join multiple tables with columns from multiple table
Souvik,
In your new example, you need a RIGHT OUTER JOIN between table1 and table2
(order matters - table1 on left, table2 on right) on the ID1 column.
Something like this (untested by me):
SELECT
   table1.*,
   table2.*
FROM
   table1
   RIGHT OUTER JOIN table2
   ON (table1.id=table2.id_1);

Mark

On Thu, Dec 27, 2012 at 9:26 PM, Souvik Banerjee
<[EMAIL PROTECTED]>wrote:

> Thanks a lot Mark for your attention.
> But I think I cannot go for INNER join, the reason behind the fact being
> that I want all rows of Table3 irrespective of there is any row
> corresponding to that ID in Table 1 or Table 2.
> Probably I would have taken care of that while providing the example.
> The more refined example would be like below.
>
> Look forward for your help.
>
> Thanks and regards,
> Souvik.
>
> P.S. Dropped one table from the earlier example and Now we have table 1
> and table 2, we are looking for table 4. (Hope so it's more simple and it's
> exactly we need)
>
>
>
> *Table 1*
>
> ID
>
> Value
>
> 1
>
> V11
>
> 2
>
> V12
>
> 3
>
> V13
>
>
>
> *Table 2*
>
> ID_1
>
> ID_2
>
> Other_Column1
>
> 1
>
> 3
>
> C1
>
> 2
>
> 1
>
> C2
>
> 3
>
> 2
>
> C3
>
> 4
>
> 9
>
> C4
>
> 12
>
> 3
>
> C5
>
> 1
>
> 8
>
> C6
>
>
>
> I want to formulate a table which would look like (*Table 4*)
>
> ID_1
>
> ID_1_Value
>
> ID_2
>
> ID_2_value
>
> Other_Column1
>
> 1
>
> V11
>
> 3
>
> V13
>
> C1
>
> 2
>
> V12
>
> 1
>
> V11
>
> C2
>
> 3
>
> V13
>
> 2
>
> V12
>
> C3
>
> 4
>
> NULL / Empty
>
> 9
>
> NULL / Empty
>
> C4
>
> 12
>
> NULL / Empty
>
> 3
>
> V13
>
> C5
>
> 1
>
> V11
>
> 8
>
> NULL / Empty
>
> C6
>
>
> On Thu, Dec 27, 2012 at 9:06 PM, Mark Grover <[EMAIL PROTECTED]>wrote:
>
>> Souvik,
>> Let me begin by saying that simplifying the problem goes a long way in
>> helping us answer your question. You did it really nicely here, so thank
>> you for doing that.
>>
>> Why don't you use INNER JOIN instead of LEFT SEMI JOIN? You can express
>> the same query as INNER JOIN with no restrictions on what can be included
>> in the SELECT clause. LEFT SEMI JOIN just implements an "exists" like query
>> efficiently. If you want columns from the right table in your select list,
>> just do the regular (aka inefficient way) inner join.
>>
>> Of course, you can optimize inner joins as map joins, sorted merge join
>> or sorted merge bucketed joins depending on your use case.
>>
>> Mark
>>
>> On Thu, Dec 27, 2012 at 3:16 PM, Souvik Banerjee <
>> [EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>>
>>> I am struggling with a problem described below.
>>> Any help how to resolve this problem is highly appreciated.
>>>
>>> I have got few tables the structure is over simplified for the sake of
>>> describing the nature of the problem that I am facing.
>>>
>>> *Table 1*
>>> * *
>>>
>>>
>>>
>>> ID
>>>
>>> Value
>>>
>>> 1
>>>
>>> V11
>>>
>>> 2
>>>
>>> V12
>>>
>>> 3
>>>
>>> V13
>>>
>>>
>>>
>>> *Table 2*
>>> * *
>>>
>>>
>>>
>>> ID
>>>
>>> Value
>>>
>>> 1
>>>
>>> V21
>>>
>>> 2
>>>
>>> V22
>>>
>>> 3
>>>
>>> V23
>>>
>>>
>>>
>>> *Table 3*
>>> * *
>>>
>>>
>>>
>>> ID_1
>>>
>>> ID_2
>>>
>>> Other_Column1
>>>
>>> 1
>>>
>>> 3
>>>
>>> C1
>>>
>>> 2
>>>
>>> 1
>>>
>>> C2
>>>
>>> 3
>>>
>>> 2
>>>
>>> C3
>>>
>>>
>>>
>>> I want to formulate a table which would look like
>>>
>>> ID_1
>>>
>>> ID_1_Value
>>>
>>> ID_2
>>>
>>> ID_2_value
>>>
>>> Other_Column1
>>>
>>> 1
>>>
>>> V11
>>>
>>> 3
>>>
>>> V23
>>>
>>> C1
>>>
>>> 2
>>>
>>> V12
>>>
>>> 1
>>>
>>> V21
>>>
>>> C2
>>>
>>> 3
>>>
>>> V13
>>>
>>> 2
>>>
>>> V22
>>>
>>> C3
>>>
>>>
>>>
>>> I am facing problem with this.
>>> I tried to LEFT SEMI JOIN in Hive.
>>>
>>> I tried to do it in two steps (For 3 tables)
>>> First step I wanted to do a LEFT SEMI JOIN with TABLE 1 and TABLE 3.
>>> But the problem is that in LEFT SEMI JOIN you can not have columns from
>>> the right table in the select clause. So after join my new table simply
>>> looks like TABLE 3.
>>>
>>> Can you help me how I can achieve this is HIVE.
>>>
>>> Thanks and regards,
>>> Souvik.
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB