Pig, mail # user - avoiding Group by or filter


Re: avoiding Group by or filter
Preeti Gupta 2013-03-05, 15:10
Because there is nothing to group by.
On Mar 5, 2013, at 3:14 AM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:

> Why don't you want to group?
>
>
> 2013/3/5 Preeti Gupta <[EMAIL PROTECTED]>
>
>> I want to compute the average of a one-column dataset:
>> 1
>> 2
>> 3
>> 4
>> 5
>>
>> and I am not able to do it without grouping.
>>
>> However I got an average with
>>
>> avg = foreach (group dividends all) generate AVG(dividends);
>>
>> But
>>
>> avg       = foreach (filter dividends by A>-10000000.0) generate AVG(A);
>>
>> says to use an explicit cast.
>>
>> My script is very small
>>
>> dividends = load 'myfile.txt' as (A:double);
>> dump dividends;
>> --grouped   = filter dividends by A>-10000000.0;
>> avg       = foreach (filter dividends by A>-10000000.0) generate AVG(A);
>>
>>
>>
>> <file try.pig, line 5, column 65> Multiple matching functions for
>> org.apache.pig.builtin.AVG with input schema: ({{(bytearray)}},
>> {{(double)}}). Please use an explicit cast.
>>
>>
>> On Mar 4, 2013, at 8:30 PM, Prashant Kommireddi <[EMAIL PROTECTED]>
>> wrote:
>>
>>> Hi Preeti,
>>>
>>> Using FILTER or not depends on your requirements and has nothing to do with
>>> SUM or AVG.
>>>
>>> SUM and AVG accept bags as input, so as long as you are able to provide one,
>>> it should be fine. (Though it's very common for users to use GROUP BY to
>>> roll up on a key before using these UDFs.)
>>>
>>> For example:
>>>
>>> grunt> cat data
>>> 1    5
>>> 5    8
>>>
>>> grunt> A = load 'data';
>>> grunt> B = foreach A generate TOBAG($0, $1) as bagg;
>>> grunt> dump B;
>>> ({(1),(5)})
>>> ({(5),(8)})
>>>
>>> grunt> C = foreach B generate AVG(bagg);
>>> grunt> dump C;
>>> (3.0)
>>> (6.5)
>>>
>>> -Prashant
>>>
>>>
>>> On Mon, Mar 4, 2013 at 3:50 PM, Preeti Gupta <[EMAIL PROTECTED]>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> Can I compute SUM or AVG without using GROUP BY or FILTER?
>>>>
>>
>>
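
For reference, here is a minimal sketch of why the FILTER version in the quoted script fails while the GROUP ALL version works. AVG takes a bag. Inside a FOREACH over an ungrouped (or merely filtered) relation, A is a single double per row, which is most likely why Pig cannot pick a matching AVG signature and asks for an explicit cast. The aliases `kept`, `everything`, and `avg_all` are hypothetical names chosen for illustration; the file name, schema, and filter threshold are taken from the script above.

```pig
-- Same load as in the quoted script: one double column named A.
dividends = LOAD 'myfile.txt' AS (A:double);

-- FILTER only drops rows. Its output is still a flat relation of (A:double)
-- tuples, so it does not produce the bag that AVG needs.
kept = FILTER dividends BY A > -10000000.0;

-- GROUP ... ALL collapses the relation into one row of the form
-- (all, {bag of every tuple in kept}), even though there is no grouping key.
everything = GROUP kept ALL;

-- kept.A projects column A out of that bag, so AVG receives a bag of doubles.
avg_all = FOREACH everything GENERATE AVG(kept.A);

DUMP avg_all;
```

In this sketch GROUP ALL is not grouping by any column; it only wraps the whole relation in a single bag, which matches the TOBAG example above where AVG is likewise given a bag.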