Pig >> mail # user >> union


Hi,

According to Pig's documentation on union, two relations which have the same
schema (the same length, and types that can be implicitly cast) can be
concatenated (see http://pig.apache.org/docs/r0.11.1/basic.html#union).

However, when I try with:
A = load '1.txt' using PigStorage(' ') as (x:int, y:chararray, z:chararray);
B = load '1_ext.txt' using PigStorage(' ') as (a:int, b:chararray, c:chararray);
C = union A, B;
describe C;
DUMP C;
store C into '/home/kereno/Documents/pig-0.11.1/workspace/res';

with these input files:
~/Documents/pig-0.11.1/workspace 130$ more 1.txt 1_ext.txt
::::::::::::::
1.txt
::::::::::::::
1 a aleph
2 b bet
3 g gimel
::::::::::::::
1_ext.txt
::::::::::::::
0 a alpha
0 b beta
0 g gimel
I get as a result:
~/Documents/pig-0.11.1/workspace 0$ more res/part-m-0000*
::::::::::::::
res/part-m-00000
::::::::::::::
0 a alpha
0 b beta
0 g gimel
::::::::::::::
res/part-m-00001
::::::::::::::
1 a aleph
2 b bet
3 g gimel

Whereas I was expecting something like
0 a alpha
0 b beta
0 g gimel
1 a aleph
2 b bet
3 g gimel

[all together]
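
To make the expected result concrete: concatenating the two part files in
name order gives exactly that listing. This is only a minimal sketch, assuming
the part files sit on the local filesystem; the scratch directory and
merged.txt are illustrative names, not part of my actual setup, and the file
contents are copied from the output above.

```shell
# Sketch only: recreate the two part files shown above in a scratch directory.
tmp=$(mktemp -d)
mkdir -p "$tmp/res"
printf '0 a alpha\n0 b beta\n0 g gimel\n' > "$tmp/res/part-m-00000"
printf '1 a aleph\n2 b bet\n3 g gimel\n' > "$tmp/res/part-m-00001"

# The glob expands in sorted order, so part-m-00000 comes before part-m-00001.
cat "$tmp"/res/part-m-* > "$tmp/merged.txt"

# Show the merged listing (hypothetical file name).
cat "$tmp/merged.txt"
```

This only merges the files after the fact; it does not explain why the store
writes two of them.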

I understand that two files would be generated for non-matching schemas, but
why for a union with matching schemas?

Thanks,
Keren

--
Keren Ouaknine
Web: www.kereno.com