Pig >> mail # user >> Count of all the rows


Re: Count of all the rows
Even in SQL, when you do select count(*) you are actually grouping; the language just hides it from you.

Each map/combiner counts the number of records it sees and sends that count to the reducer, which sums the counts.
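
To make the flow above concrete, here is a rough sketch in plain Python. It is not Pig's actual implementation; the function names and the three input splits are made up for illustration. It only mimics the idea that each map/combiner emits one partial count and a single reducer adds them up.

```python
# Hypothetical illustration only; Pig's real execution is more involved.

def map_combine(split):
    """Count the records in one input split (the map/combiner side)."""
    return sum(1 for _ in split)

def reduce_counts(partial_counts):
    """Sum the per-split counts (the one reducer handling the 'all' group)."""
    return sum(partial_counts)

# Three made-up input splits standing in for the relation A.
splits = [["r1", "r2", "r3"], ["r4", "r5"], ["r6"]]

partials = [map_combine(s) for s in splits]  # one small number per split
total = reduce_counts(partials)              # the overall row count

print(partials, total)
```

Because only one number per split travels to the reducer, grouping everything into a single group stays cheap even for large inputs.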

Alan.

On Aug 29, 2012, at 4:41 PM, Mohit Anchlia wrote:

> Thanks! Why is grouping necessary? Is it to send it to the reducer?
>
> On Wed, Aug 29, 2012 at 4:03 PM, Alan Gates <[EMAIL PROTECTED]> wrote:
>
>> A = load 'foo';
>> B = group A all;
>> C = foreach B generate COUNT(A);
>>
>> Alan.
>> On Aug 29, 2012, at 3:51 PM, Mohit Anchlia wrote:
>>
>>> How do I get count of all the rows? All the examples of COUNT use group by.
>>
>>