Pig, mail # user - FILTER and fields from tuple/bags


FILTER and fields from tuple/bags
Mohit Anchlia 2012-04-12, 00:27
I am new to Pig and have gone through the reference. I am getting used to
how it works, but questions keep coming up as I write my scripts. I have a
couple of questions:

1) Below I am trying to FILTER, JOIN and merge the results into one row, but
in the end I get rows that seem to have bags inside tuples. All I want is to
output the values of some of the fields from each row in "a,b,c" format.
How can I do that?
NM_CT_ST_FILTER = FILTER A BY (FIELD_ID == 'NAM2' OR FIELD_ID == 'CITY' OR
    FIELD_ID == 'ST' OR FIELD_ID == 'ZIP');

AG_OC_MT_FILTER = FILTER A BY (FIELD_ID == 'AGE' OR FIELD_ID == 'OCCUP' OR
    FIELD_ID == 'MARITAL') AND FORM_ID == 'FPERSWKS' AND FORM_COPY_NUM == '1';

NM_CT_ST = JOIN NM_CT_ST_FILTER BY (FILE_NAME, CREATED_DATE),
    D BY (FILE_NAME, CREATED_DATE);

AG_OC_MT = JOIN AG_OC_MT_FILTER BY
    (FILE_NAME, CREATED_DATE, FORM_ID, FORM_ID_ROOT),
    D BY (FILE_NAME, CREATED_DATE, FORM_ID, FORM_ID_ROOT);

FINAL = COGROUP NM_CT_ST BY (D::FILE_NAME, D::CREATED_DATE),
    AG_OC_MT BY (D::FILE_NAME, D::CREATED_DATE);

2) Is it possible to use FILTER with FOREACH? Something like: FOREACH A
GENERATE B FILTER FIELD BY .. OR FIELD BY ..