Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive >> mail # user >> Union in Multi Insert


Copy link to this message
-
Union in Multi Insert
I am trying to do a union, group by, and multi insert all at once. I know
this convoluted but I what I am trying to do is avoid having to scan
through the original table more than once... if I can get all my data from
two columns that I want to pull together, in one round of mappers, I win...
 Basically, col2 and col3 are they type of data, one is src, one is dst, I
want a single record for every uniq value of both col2 and col3.

Any thoughts?

from (
    select
       col1, col2, col3, col4
    from table
) a
INSERT overwrite TABLE table_summary
select col1, unioned_col, count(1), count(distinct col4) from
(select col1, col2 as unioned_col, col4
UNION ALL
select col1, col3 as unioned_col, col4
) b
group by col1, unioned_col