Pig >> mail # user >> Count of all the rows

Re: Count of all the rows
Even in SQL, when you do select count(*) you are actually grouping; the language just hides it from you.

Each map/combiner counts the number of records it sees and sends that count to the reducer, which sums the counts.
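
A rough illustration of that flow, in plain Python rather than Pig or Hadoop code. The function names and the sample splits are made up for the example; the only point is that each map/combiner produces one partial count and the reducer adds the partial counts together.

```python
# Toy simulation of how a "group all" + COUNT job is evaluated.
# Not Pig or Hadoop API code: map_and_combine, reduce_counts and the
# sample splits are hypothetical names and data chosen for illustration.

def map_and_combine(split):
    """Map/combiner side: count the records in one input split."""
    return sum(1 for _ in split)

def reduce_counts(partial_counts):
    """Reducer side: sum the partial counts from every map/combiner."""
    return sum(partial_counts)

# Pretend the input file was divided into four splits.
splits = [
    ["a", "b", "c"],
    ["d", "e"],
    [],
    ["f"],
]

# Each split is counted independently; only the small per-split
# counts, not the records themselves, travel to the reducer.
partial_counts = [map_and_combine(s) for s in splits]
total = reduce_counts(partial_counts)

print(partial_counts, total)
```

Because only one number per map/combiner crosses the network, the grouping step costs very little even though every record nominally lands in the same group.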


On Aug 29, 2012, at 4:41 PM, Mohit Anchlia wrote:

> Thanks! Why is grouping necessary? Is it to send it to the reducer?
> On Wed, Aug 29, 2012 at 4:03 PM, Alan Gates <[EMAIL PROTECTED]> wrote:
>> A = load 'foo';
>> B = group A all;
>> C = foreach B generate COUNT(A);
>> Alan.
>> On Aug 29, 2012, at 3:51 PM, Mohit Anchlia wrote:
>>> How do I get count of all the rows? All the examples of COUNT use group by.