Pig, mail # user - Count of all the rows


Re: Count of all the rows
Alan Gates 2012-08-29, 23:52
Even in SQL, when you do select count(*), you are actually grouping; the language just hides it from you.

Each map/combiner counts the number of records it sees and sends that count to the reducer, which sums the counts.
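
To make that concrete, here is a rough sketch in plain Python (not Pig, and not how Pig is actually implemented) of the count-then-sum pattern. The function names and the sample splits are made up for illustration; the only point is that each split yields a partial count and a single reducer adds them up.

```python
def map_combine(split):
    """Map/combiner side: count the records seen in one input split."""
    return sum(1 for _ in split)


def reduce_counts(partial_counts):
    """Reducer side: sum the per-split counts into one total."""
    return sum(partial_counts)


# Hypothetical input, already divided into three splits.
splits = [["a", "b", "c"], ["d", "e"], ["f"]]

partials = [map_combine(s) for s in splits]  # one small number per split
total = reduce_counts(partials)              # the single "group all" result

print(partials, total)  # prints: [3, 2, 1] 6
```

Only the small per-split counts cross the network to the reducer, not the records themselves, which is why grouping everything under one key stays cheap for COUNT.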

Alan.

On Aug 29, 2012, at 4:41 PM, Mohit Anchlia wrote:

> Thanks! Why is grouping necessary? Is it to send it to the reducer?
>
> On Wed, Aug 29, 2012 at 4:03 PM, Alan Gates <[EMAIL PROTECTED]> wrote:
>
>> A = load 'foo';
>> B = group A all;
>> C = foreach B generate COUNT(A);
>>
>> Alan.
>> On Aug 29, 2012, at 3:51 PM, Mohit Anchlia wrote:
>>
>>> How do I get count of all the rows? All the examples of COUNT use group by.
>>
>>