Pig >> mail # dev >> Possible bug in replicated join?


Re: Possible bug in replicated join?
That certainly looks like a bug. The replicated join should not flatten
the tuple.
I didn't actually know that Pig supported joining on tuples (I guess
it does not allow that on maps and bags).
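
For anyone who wants to reproduce this, the steps from the quoted message can be collected into a single script. This is only a sketch: it assumes a file named 'data' with one integer per line, as in the original report. The describe statements are an addition that was not in the report, included on the assumption that comparing the two schemas is a useful extra check. The outputs noted in the comments are the ones reported in the quoted message, not re-verified here.

```pig
-- Build two single-field relations whose only column is a tuple.
a = load 'data' as (x:int);
b = foreach a generate TOTUPLE(x);

c = load 'data' as (x:int);
d = foreach c generate TOTUPLE(x);

-- Default join on the tuple column.
-- Reported output: ((1),(1)) through ((5),(5)), i.e. tuples stay nested.
e = join b by $0, d by $0;
describe e;  -- added for comparison; not part of the original report
dump e;

-- Same join with the 'replicated' strategy.
-- Reported output: (1,1) through (5,5), i.e. the tuples come back flattened.
f = join b by $0, d by $0 using 'replicated';
describe f;  -- added for comparison; not part of the original report
dump f;
```

If the two dumps differ only in nesting, as reported, that matches the flattening described above.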

-Thejas
On 6/21/12 11:29 AM, Jonathan Coveney wrote:
> I'm posting before making a ticket just to make sure I'm not doing something
> stupid or missing something obvious.
>
> $ cat data
> 1
> 2
> 3
> 4
> 5
>
> a = load 'data' as (x:int);
> b = foreach a generate TOTUPLE(x);
>
> c = load 'data' as (x:int);
> d = foreach c generate TOTUPLE(x);
>
> e = join b by $0, d by $0;
> dump e;
>
> ((1),(1))
> ((2),(2))
> ((3),(3))
> ((4),(4))
> ((5),(5))
>
> ok....
> but
> f = join b by $0, d by $0 using 'replicated';
> dump f;
>
> (1,1)
> (2,2)
> (3,3)
> (4,4)
> (5,5)
>
> !!!!