

DISTINCT with 2 fields in a tuple
I am trying to get the distinct combinations of two fields in a record,
something like select distinct a, b from c; in SQL. I wrote the following
in Pig, but it fails to parse:
A = LOAD '/examples/form_out/part-m-00000' USING PigStorage('\t') AS
(FILE_NAME:chararray,FORM_ID:chararray,SET_ID:chararray);

B = foreach A {dist = DISTINCT A.FORM_ID, A.SET_ID; GENERATE dist;}

ERROR 1000: Error during parsing. Invalid alias: A in {FILE_NAME: chararray
...

I thought A was a tuple and that FORM_ID and SET_ID were fields I could
apply DISTINCT to. I have seen similar examples online, but none exactly
the same.
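
For reference, one common way to express select distinct a, b from c in Pig is to project the two fields first and then apply DISTINCT to the resulting relation. DISTINCT works on a whole relation, or on a bag inside a nested FOREACH (typically after a GROUP), while A here is the relation itself rather than a field of its own records, which is consistent with the "Invalid alias: A" error. The following is a minimal sketch reusing the path and schema from the question; the aliases B and C are illustrative names, not part of the original script.

```pig
-- Load the data with the same path and schema as in the question.
A = LOAD '/examples/form_out/part-m-00000' USING PigStorage('\t') AS
    (FILE_NAME:chararray, FORM_ID:chararray, SET_ID:chararray);

-- Keep only the two fields whose combinations should be unique.
B = FOREACH A GENERATE FORM_ID, SET_ID;

-- DISTINCT removes duplicate tuples from the whole relation,
-- so C holds one row per (FORM_ID, SET_ID) pair.
C = DISTINCT B;

DUMP C;
```

If the distinct pairs are needed per group instead (for example per FILE_NAME), the nested form would apply after a GROUP, where the grouped bag can be projected and passed to DISTINCT inside the FOREACH block.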