Pig, mail # user - COUNT(A.field1)


COUNT(A.field1)
Corbin Hoenes 2010-08-25, 20:58
I'm wondering about the performance of COUNT:
A = LOAD 'test.csv' AS (a1, a2, a3);
B = GROUP A BY a1;
-- Which is preferred?
C = FOREACH B GENERATE COUNT(A);
-- Or would this send only a single field through COUNT and perform better?
C = FOREACH B GENERATE COUNT(A.a2);