Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive >> mail # user >> Built - In Aggregate Function - Standard Deviation


Copy link to this message
-
Re: Built - In Aggregate Function - Standard Deviation
Zheng,

  is SQRT an undocumented builtin UDF? I couldn't see it in the lang
manual at:

http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF

hence why I did pow(x,0.5) instead

-- amr
> Hive does not support that right now, but Hive supports subquery so we
> can do:
>
> SELECT SQRT( (n*totalsqr - tot * tot) / (n*(n-1)) )
> FROM (SELECT COUNT(1) as n, SUM(col) as total, SUM( col*col ) AS
> totalsqr FROM t) t2;
>
> Zheng
>
> On Sat, May 30, 2009 at 4:08 PM, Matt Pestritto <[EMAIL PROTECTED]
> <mailto:[EMAIL PROTECTED]>> wrote:
>
>     Ah - Thanks so much. I didn't know you could reference a column
>     that has an aggregate function call again in the same select
>     statement.  This is much cleaner that the approach that I took.
>     I'll give it a shot.
>
>     Thanks again.
>
>
>     On Wed, May 27, 2009 at 4:24 AM, Amr Awadallah <[EMAIL PROTECTED]
>     <mailto:[EMAIL PROTECTED]>> wrote:
>
>         I agree that a builtin for std dev is a good idea.
>
>         that said, you can achieve this easy in one pass, just use:
>
>         select sum( pow(col,2) ) as totsqr, sum( col ) as tot,
>         count(1) as n, pow( (n*totsqr - pow(tot,2) )/(n*(n-1)), 0.5)
>         as stddev
>         from ....
>
>
>         Matt Pestritto wrote:
>
>             Hi.
>
>             Are there plans to write a standard deviation aggregate
>             function ?  I had to build my own which translated into
>             multiple hive queries.  While it works, a build-in
>             function would have been much easier.
>
>             Thanks
>             -Matt
>
>
>
>
>
> --
> Yours,
> Zheng

NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB