Pig >> mail # user >> avoiding Group by or filter


Re: avoiding Group by or filter
Because there is nothing to group.
On Mar 5, 2013, at 3:14 AM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:

> Why don't you want to group?
>
>
> 2013/3/5 Preeti Gupta <[EMAIL PROTECTED]>
>
>> I want to compute the average of a one-column dataset:
>> 1
>> 2
>> 3
>> 4
>> 5
>>
>> and I am not able to do it without grouping.
>>
>> However I got an average with
>>
>> avg = foreach (group dividends all) generate AVG(dividends);
>>
>> But
>>
>> avg       = foreach (filter dividends by A>-10000000.0) generate AVG(A);
>>
>> says to use an explicit cast.
>>
>> My script is very small
>>
>> dividends = load 'myfile.txt' as (A:double);
>> dump dividends
>> --grouped   = filter dividends by A>-10000000.0;
>> avg       = foreach (filter dividends by A>-10000000.0) generate AVG(A);
>>
>>
>>
>> <file try.pig, line 5, column 65> Multiple matching functions for
>> org.apache.pig.builtin.AVG with input schema: ({{(bytearray)}},
>> {{(double)}}). Please use an explicit cast.
>>
>>
>> On Mar 4, 2013, at 8:30 PM, Prashant Kommireddi <[EMAIL PROTECTED]>
>> wrote:
>>
>>> Hi Preeti,
>>>
>>> Using FILTER or not depends on your requirements and has nothing to do with
>>> SUM or AVG.
>>>
>>> SUM and AVG accept bags as input, so as long as you are able to provide one,
>>> it should be fine. (Though it's very common for users to GROUP BY a key
>>> to roll up before using these UDFs.)
>>>
>>> For example:
>>>
>>> grunt> cat data
>>> 1    5
>>> 5    8
>>>
>>> grunt> A = load 'data';
>>> grunt> B = foreach A generate TOBAG($0, $1) as bagg;
>>> grunt> dump B;
>>> ({(1),(5)})
>>> ({(5),(8)})
>>>
>>> grunt> C = foreach B generate AVG(bagg);
>>> grunt> dump C;
>>> (3.0)
>>> (6.5)
>>>
>>> -Prashant
>>>
>>>
>>> On Mon, Mar 4, 2013 at 3:50 PM, Preeti Gupta <[EMAIL PROTECTED]>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> Can I compute SUM or AVG without using GROUP BY or FILTER?
>>>>
>>
>>
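For readers landing on this thread: the point made above is that AVG takes a bag as input. A likely reading of the error is that FILTER still yields one row per record, so inside the FOREACH the field A is a plain double rather than a bag, and AVG has no matching signature. A minimal sketch of the usual workaround follows, assuming the one-column myfile.txt from the thread. The aliases filtered, grouped and mean_value are made up for illustration, and the filter step is optional.

```pig
-- Load the one-column file described in the thread.
dividends  = LOAD 'myfile.txt' AS (A:double);

-- Optional: drop unwanted rows first. FILTER keeps one row per record,
-- so it does not produce a bag on its own.
filtered   = FILTER dividends BY A > -10000000.0;

-- GROUP ... ALL puts every remaining row into a single bag named 'filtered'.
grouped    = GROUP filtered ALL;

-- filtered.A projects column A out of that bag, which is the input AVG expects.
mean_value = FOREACH grouped GENERATE AVG(filtered.A);

DUMP mean_value;
```

GROUP ... ALL does not need a grouping key, so it also fits the case where there is nothing to group by: it exists only to turn the whole relation into the single bag that AVG or SUM needs.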