-RE: Query output formatting
Steven Wong 2010-12-07, 06:36
Good tip, thanks!
From: Dilip Joseph [mailto:[EMAIL PROTECTED]]
Sent: Monday, December 06, 2010 4:33 PM
To: [EMAIL PROTECTED]
Subject: Re: Query output formatting
If you have a fixed number of known CDNs, the following query can help:
SUM(IF(cdn=8, bitrate,0))/SUM(IF(cdn=8, 1, 0)) avgBitrateCdn8,
SUM(IF(cdn=9, bitrate,0))/SUM(IF(cdn=9, 1, 0)) avgBitrateCdn9 -- You will need more IFs to handle 0 denominators.
GROUP BY hour
On Mon, Dec 6, 2010 at 1:01 PM, Steven Wong <[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>> wrote:
I have this query to calculate some averages:
select hour, cdn, avg(bitrate) from fact_table group by hour, cdn
1 8 a
1 9 b
2 8 c
3 8 d
3 9 e
But I want the output to be in the following format so that it can be eyeballed/graphed more easily:
1 a b
2 c NULL
3 d e
(The 2nd and 3rd columns are averages for cdn 8 and 9, respectively.)
Is there an easy way to do this - in Hive, Unix, etc.? Suggestions (or, better yet, solutions) are welcome.
I imagine a UDAF could do this (see below), but AFAIK it is not built into Hive.
select hour, some_udaf(abr, cdn, array(8, 9))
from (select hour, cdn, avg(bitrate) as abr from fact_table group by hour, cdn) t
group by hour
Dilip Antony Joseph