Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive >> mail # user >> Query output formatting


Copy link to this message
-
Re: Query output formatting
If you have a fixed number of known CDNs, the following query can help:

SELECT  hour,
               SUM(IF(cdn=8, bitrate,0))/SUM(IF(cdn=8, 1, 0))
avgBitrateCdn8,
               SUM(IF(cdn=9, bitrate,0))/SUM(IF(cdn=9, 1, 0)) avgBitrateCdn9
-- You will need more IFs to handle 0 denominators.
FROM fact_table
GROUP BY hour

Dilip

On Mon, Dec 6, 2010 at 1:01 PM, Steven Wong <[EMAIL PROTECTED]> wrote:

>  I have this query to calculate some averages:
>
>
>
> select hour, cdn, avg(bitrate) from fact_table group by hour, cdn
>
> 1              8              a
>
> 1              9              b
>
> 2              8              c
>
> 3              8              d
>
> 3              9              e
>
>
>
> But I want the output to be in the following format so that it can be
> eyeballed/graphed more easily:
>
>
>
> 1              a              b
>
> 2              c              NULL
>
> 3              d             e
>
>
>
> (The 2nd and 3rd columns are averages for cdn 8 and 9, respectively.)
>
>
>
> Is there an easy way to do this – in Hive, Unix, etc.? Suggestions (or,
> better yet, solutions) are welcome.
>
>
>
> I imagine a UDAF could do this (see below), but AFAIK it is not built into
> Hive.
>
>
>
> select hour, some_udaf(abr, cdn, array(8, 9))
>
> from (select hour, cdn, avg(bitrate) as abr from fact_table group by hour,
> cdn) t
>
> group by hour
>
>
>
> Thanks.
>
> Steven
>
>
>

--
_________________________________________
Dilip Antony Joseph
http://csgrad.blogspot.com
http://www.marydilip.info