Pig >> mail # user >> Count of all the rows


Mohit Anchlia 2012-08-29, 22:51
Alan Gates 2012-08-29, 23:03
Mohit Anchlia 2012-08-29, 23:41
Re: Count of all the rows
Even in SQL, when you do select count(*) you are actually grouping; the language just hides it from you.

Each map/combiner counts the number of records it sees and sends that count to the reducer, which sums the counts.

Alan.
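
The counting scheme described above can be sketched roughly as follows. This is a plain Python illustration, not Pig or Hadoop code: the names `partial_count` and `splits` are hypothetical, and each inner list stands in for the records one map task would see.

```python
from typing import Iterable, List


def partial_count(split: Iterable[str]) -> int:
    """Stand-in for one map/combiner: count the records in its own split."""
    return sum(1 for _ in split)


# Hypothetical input splits; in a real job these would be blocks of 'foo'.
splits: List[List[str]] = [["a", "b", "c"], ["d", "e"], []]

# Each map/combiner emits a single partial count under the same key ("all").
partials = [partial_count(s) for s in splits]

# The single reducer for that key sums the partial counts.
total = sum(partials)

print(partials, total)
```

Because every partial count is sent under one key, a single reducer receives all of them, which is roughly the role "group A all" plays in the Pig script.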

On Aug 29, 2012, at 4:41 PM, Mohit Anchlia wrote:

> Thanks! Why is grouping necessary? Is it to send it to the reducer?
>
> On Wed, Aug 29, 2012 at 4:03 PM, Alan Gates <[EMAIL PROTECTED]> wrote:
>
>> A = load 'foo';
>> B = group A all;
>> C = foreach B generate COUNT(A);
>>
>> Alan.
>> On Aug 29, 2012, at 3:51 PM, Mohit Anchlia wrote:
>>
>>> How do I get count of all the rows? All the examples of COUNT use group by.
>>
>>
Jonathan Coveney 2012-08-29, 23:51
Mohit Anchlia 2012-08-30, 16:57
Alan Gates 2012-09-04, 15:51
Mohit Anchlia 2012-08-30, 00:20