Pig >> mail # user >> add a field, ordered
Re: add a field, ordered
Hi,

We are finalizing a feature that should solve your problem. It is similar to
ROW_NUMBER in some SQL dialects; we call it RANK.
This operator will add a unique, consecutive row number to each tuple in a
relation.
Then you will be able to join the two relations on the rank field.
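
To make the idea concrete, here is a rough sketch in plain Python rather than
Pig, since the operator is not released yet. The helper names add_rank and
join_on_rank are made up for illustration, and starting the numbering at 1 is
an assumption; this only mimics the rank-then-join idea on in-memory lists.

```python
# Conceptual sketch only: plain Python lists stand in for Pig relations.
# add_rank and join_on_rank are hypothetical helpers, not Pig operators.

def add_rank(relation):
    """Prepend a consecutive row number (assumed to start at 1) to each tuple."""
    return [(rank,) + tuple(row) for rank, row in enumerate(relation, start=1)]

def join_on_rank(left, right):
    """Inner-join two ranked relations on the leading rank field, then drop it."""
    right_by_rank = {row[0]: row[1:] for row in right}
    return [row[1:] + right_by_rank[row[0]]
            for row in left if row[0] in right_by_rank]

relation1 = [(5,), (9,), (7,)]
relation2 = [("z",), ("a",), ("d",)]

relation3 = join_on_rank(add_rank(relation1), add_rank(relation2))
print(relation3)  # [(5, 'z'), (9, 'a'), (7, 'd')]
```

Note that this relies on each list already being in the order you care about;
in a distributed job that ordering would have to be established before the
row numbers are assigned.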

For the time being, however, I think there is no easy way to achieve what
you want to do.

Cheers,
--
Gianmarco

On Tue, Aug 14, 2012 at 11:55 AM, Lauren Blau <[EMAIL PROTECTED]> wrote:

> I want to match up tuples from 2 relations. For each key, the 2 relations
> will always have the same number of tuples and match by position (the first
> tuple in each is a match, the second tuple in each, etc.).
>
> so if I have
> relation1 = 5,9,7
> relation2 = z,a,d
>
> I want to end up with
>
> relation3 = (5,z),(9,a),(7,d)
>
> I figure I need a way to generate a matching key on the ordered tuples of
> the relations and then do a cogroup. But I'm stuck on generating the key.
> Since adding a field is a project, I assume this has to be done as part of
> a foreach loop. But I'm not sure how I can maintain the order while adding
> a field to each tuple.
>
> ideas?
> Thanks,
> lauren
>