Pig, mail # user - field name reference - alias


field name reference - alias
Keren Ouaknine 2013-08-09, 01:59
Hello,

Can one refer to a field name with no ambiguity by its full name (A::x
instead of x)? Below are two contradictory behaviors:
*First example:*
A = load '1.txt' using PigStorage(' ') as (x:int, y:chararray, z:chararray);
B = load '1_ext.txt' using PigStorage(' ') as (a:int, b:chararray, c:chararray);
C = JOIN A by x LEFT OUTER, B BY a;
D = FOREACH C GENERATE A::x as toto;
describe C;
describe D;

*output:*
C: {A::x: int,A::y: chararray,A::z: chararray,B::a: int,B::b: chararray,B::c: chararray}
D: {toto: int}

It also works fine if you refer to A::x simply as x.

*Second example with toMap:*
A = load '1.txt' using PigStorage(' ') as (x:int, y:chararray, z:chararray);
B = FOREACH A GENERATE TOMAP('toto', x);
describe B;
DUMP B;
store B into '/home/kereno/Documents/pig-0.11.1/workspace/res';

*output:*
B: {map[]}

If you change the script to refer to A::x, you get an error, as follows:
A = load '1.txt' using PigStorage(' ') as (x:int, y:chararray, z:chararray);
B = FOREACH A GENERATE TOMAP('toto', A::x);
describe B;
DUMP B;
store B into '/home/kereno/Documents/pig-0.11.1/workspace/res';

*output:*
<file tomap.pig, line 2, column 37> Invalid field projection. Projected
field [A::x] does not exist in schema: x:int,y:chararray,z:chararray.

My question is: why can I use either form in the FOREACH but not in the
TOMAP?
Side note: I am asking because I am generating the schemas of a Pig script
and using them as input for another language (a project translating Pig to
Algebricks), and I would like to be consistent with Pig's behavior :).
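
To make the two behaviors concrete, here is a minimal Python sketch of a
field lookup rule that would reproduce both outputs above. It is an
assumption inferred from the describe output and the error message, not
Pig's actual implementation; the helper names join_schema and
resolve_field are hypothetical. The idea: a JOIN prefixes every field with
its input alias, while a freshly loaded relation keeps the bare names, so
A::x only exists in a schema that came out of a JOIN.

```python
from typing import List

SEP = "::"  # Pig's disambiguation operator


def join_schema(left_alias: str, left_fields: List[str],
                right_alias: str, right_fields: List[str]) -> List[str]:
    """Schema after a JOIN: each field is prefixed with its input alias."""
    return ([f"{left_alias}{SEP}{f}" for f in left_fields]
            + [f"{right_alias}{SEP}{f}" for f in right_fields])


def resolve_field(schema: List[str], name: str) -> int:
    """Return the position of `name` in `schema`.

    Assumed rule: try an exact match first, then accept a short name if
    exactly one prefixed field ends with '::' + name.
    """
    if name in schema:
        return schema.index(name)
    matches = [i for i, f in enumerate(schema) if f.endswith(SEP + name)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise KeyError(
            f"Projected field [{name}] does not exist in schema: "
            f"{','.join(schema)}")
    raise KeyError(f"Field [{name}] is ambiguous in schema: "
                   f"{','.join(schema)}")


# First example: schema of C after the JOIN; both spellings resolve.
C = join_schema("A", ["x", "y", "z"], "B", ["a", "b", "c"])
pos_full = resolve_field(C, "A::x")
pos_short = resolve_field(C, "x")

# Second example: schema of A straight from load; only the bare name exists.
A = ["x", "y", "z"]
pos_bare = resolve_field(A, "x")
```

Under this rule, resolve_field(A, "A::x") raises, because no field in the
loaded schema is named A::x or ends with ::A::x. That matches the error
seen with TOMAP, and it would suggest the difference comes from the
preceding JOIN rather than from TOMAP itself.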

Thanks,
Keren

--
Keren Ouaknine
Web: www.kereno.com