Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive, mail # user - UNION ALL query behaving strangely; WHERE CLAUSE is also not being honored


Copy link to this message
-
Re: UNION ALL query behaving strangely; WHERE CLAUSE is also not being honored
Yin Huai 2013-09-13, 00:27
Hi,

Can you also attach the query plan (the result of EXPLAIN)? It may help to
find where is the problem.

Thanks,

Yin
On Thu, Sep 12, 2013 at 1:00 PM, Chuck Hardin <[EMAIL PROTECTED]> wrote:

> Please bear with me, because this is a pretty large query.
>
> TL;DR:  I'm doing a UNION ALL on a bunch of subqueries.  The subqueries
> return no results, but the overall query does.  Sadly, the constraints on
> the subqueries are not being honored in the UNION ALL.
>
> I'm executing the following:
>
> SELECT lax_year,lax_month,lax_day,network_id,advertiser_id,publisher_id,
>        SUM(entries.clicks), SUM(entries.conversions),
>        SUM(entries.cost), SUM(entries.feed_calls),
>        SUM(entries.impressions), SUM(entries.revenue),
>        SUM(entries.used_listings)
> FROM
>   (
>     SELECT
>       used_listing.advertiserid AS advertiser_id,
>       -1 AS keyword,
>       used_listing.adgroupid AS adgroup_id,
>       used_listing.adid AS ad_id,
>       used_listing.feedid AS feed_id,
>       0 AS clicks,
>       used_listing.campaignid AS campaign_id,
>       concat(used_listing.publisherid, "_", used_listing.sourceid, "_",
> used_listing.subid) AS subid,
>       0 AS conversions,
>       0 AS feed_calls,
>       0 AS impressions,
>       2013 AS lax_year,
>       used_listing.publisherid AS publisher_id,
>       0 AS cost,
>       used_listing.subid AS sub_id,
>       used_listing.sourceid AS source_id,
>       used_listing.timezone AS timezone,
>       09 AS lax_month,
>       0 AS revenue,
>       used_listing.networkid AS network_id,
>       05 AS lax_day,
>       1 AS used_listings,
>       used_listing.ul_datetime AS start_timestamp,
>       -1 AS bid_id FROM used_listing
> WHERE used_listing.ul_datetime >= 1378364400
> AND   used_listing.ul_datetime < 1378450800
> AND used_listing.networkid='e6ac40577bcc157a00f7f2d4f5bf30cc0bca6120'
>
>     UNION ALL
>     SELECT
>       click.click.advertiser_id AS advertiser_id,
>       click.click.keyword AS keyword,
>       click.click.adgroup_id AS adgroup_id,
>       click.click.ad_id AS ad_id,
>       click.feed_id AS feed_id,
>       1 AS clicks,
>       click.click.campaign_id AS campaign_id,
>       click.click.subid AS subid,
>       0 AS conversions,
>       0 AS feed_calls,
>       0 AS impressions,
>       2013 AS lax_year,
>       click.click.publisher_id AS publisher_id,
>       click.click.publisher_cost AS cost,
>       click.click.sub_id AS sub_id,
>       click.click.source_id AS source_id,
>       click.click.timezone AS timezone,
>       09 AS lax_month,
>       click.click.click_cost AS revenue,
>       click.click.network_id AS network_id,
>       05 AS lax_day,
>       0 AS used_listings,
>       click.start_timestamp AS start_timestamp,
>       click.click.bid_id AS bid_id FROM click
> WHERE click.start_timestamp >= 1378364400
> AND   click.start_timestamp < 1378450800
> AND click.click.network_id='e6ac40577bcc157a00f7f2d4f5bf30cc0bca6120'
>
>     UNION ALL
>     SELECT
>       result.ad.advertiser_id AS advertiser_id,
>       result.bid.text AS keyword,
>       result.ad.adgroup_id AS adgroup_id,
>       result.ad.ad_id AS ad_id,
>       result.ad.feed_id AS feed_id,
>       0 AS clicks,
>       result.ad.campaign_id AS campaign_id,
>       concat(result.ad.publisher_id, "_", result.ad.source_id, "_",
> result.ad.sub_id) AS subid,
>       0 AS conversions,
>       0 AS feed_calls,
>       1 AS impressions,
>       2013 AS lax_year,
>       result.ad.publisher_id AS publisher_id,
>       0 AS cost,
>       result.ad.sub_id AS sub_id,
>       result.ad.source_id AS source_id,
>       result.ad.timezone AS timezone,
>       09 AS lax_month,
>       0 AS revenue,
>       result.ad.network_id AS network_id,
>       05 AS lax_day,
>       0 AS used_listings,
>       result.ad.start_timestamp AS start_timestamp,
>       result.bid.bid_id AS bid_id FROM impression
> WHERE result.ad.start_timestamp >= 1378364400
> AND   result.ad.start_timestamp < 1378450800
> AND result.ad.network_id='e6ac40577bcc157a00f7f2d4f5bf30cc0bca6120'