Good catch. I just added that to wiki.
Thanks Amr.
Zheng
On Sun, May 31, 2009 at 12:04 AM, Amr Awadallah <[EMAIL PROTECTED]> wrote:
> Zheng,
>
> is SQRT an undocumented builtin UDF? I couldn't see it in the lang manual
> at:
>
>
http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF>
> hence why I did pow(x,0.5) instead
>
> -- amr
>
> Hive does not support that right now, but Hive supports subquery so we can
> do:
>
> SELECT SQRT( (n*totalsqr - tot * tot) / (n*(n-1)) )
> FROM (SELECT COUNT(1) as n, SUM(col) as total, SUM( col*col ) AS totalsqr
> FROM t) t2;
>
> Zheng
>
> On Sat, May 30, 2009 at 4:08 PM, Matt Pestritto <[EMAIL PROTECTED]>wrote:
>
>> Ah - Thanks so much. I didn't know you could reference a column that has
>> an aggregate function call again in the same select statement. This is much
>> cleaner that the approach that I took. I'll give it a shot.
>>
>> Thanks again.
>>
>> On Wed, May 27, 2009 at 4:24 AM, Amr Awadallah <[EMAIL PROTECTED]> wrote:
>>
>>> I agree that a builtin for std dev is a good idea.
>>>
>>> that said, you can achieve this easy in one pass, just use:
>>>
>>> select sum( pow(col,2) ) as totsqr, sum( col ) as tot, count(1) as n,
>>> pow( (n*totsqr - pow(tot,2) )/(n*(n-1)), 0.5) as stddev
>>> from ....
>>>
>>> Matt Pestritto wrote:
>>>
>>>> Hi.
>>>>
>>>> Are there plans to write a standard deviation aggregate function ? I
>>>> had to build my own which translated into multiple hive queries. While it
>>>> works, a build-in function would have been much easier.
>>>>
>>>> Thanks
>>>> -Matt
>>>>
>>>
>>
>
>
> --
> Yours,
> Zheng
>
>
>
--
Yours,
Zheng