Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive >> mail # user >> Hive Join with distinct rows


Copy link to this message
-
Hive Join with distinct rows
Hi,

I have 2 tables, T1 and T2 of structure columns= {ID, url, timestamp}

In T1 ID's are repeatable. But T2 ID's are kind of primary key hence only
unique values are present.

I want to join both tables T1 & T2 such that select all the unique ID's in
T1 which are not in  T2

I have written 2 queries:

Query1: select distinct T1.ID from T1 LEFT OUTER JOIN T2 on T1.ID=T2.ID AND
T2.ID=NULL

this gives me expected results, but with the below I am getting all the
rows in T1

Query2: select distinct(T1.ID), T1.url, T1.timestamp from T1 LEFT OUTER
JOIN T2 on T1.ID=T2.ID AND T2.ID=NULL

Can some one point me how to achieve: select all the unique ID's in T1
which are not in T2

Thanks,
Praveen