Pig, mail # user - add a field, ordered


Lauren Blau 2012-08-14, 09:55
Gianmarco De Francisci Mo... 2012-08-14, 10:05
Re: add a field, ordered
Lauren Blau 2012-08-14, 10:38
Is the source for it available in the development area? I'd be happy to
help if I can.
Lauren

On Tue, Aug 14, 2012 at 6:05 AM, Gianmarco De Francisci Morales <
[EMAIL PROTECTED]> wrote:

> Hi,
>
> We are finalizing a feature that would solve your problem: an operator
> similar to ROW_NUMBER in some SQL dialects, which we call RANK.
> It will add a unique, consecutive row number to each tuple in the
> relation.
> You will then be able to join the two relations on the rank field.
>
> For the time being, however, I think there is no easy way to achieve what
> you want to do.
>
> Cheers,
> --
> Gianmarco
>
>
>
> On Tue, Aug 14, 2012 at 11:55 AM, Lauren Blau <
> [EMAIL PROTECTED]> wrote:
>
> > I want to match up tuples from 2 relations. For each key, the 2 relations
> > will always have the same number of tuples and match by position (the first
> > tuple in each are a match, the second tuple in each, etc).
> >
> > so if I have
> > relation1 = 5,9,7
> > relation2 = z,a,d
> >
> > I want to end up with
> >
> > relation3 = (5,z),(9,a),(7,d)
> >
> > I figure I need a way to generate a matching key on the ordered tuples of
> > the relations and then do a cogroup. But I'm stuck on generating the key.
> > Since adding a field is a projection, I assume this has to be done as part
> > of a foreach loop. But I'm not sure how I can maintain the order while
> > adding a field to each tuple.
> >
> > ideas?
> > Thanks,
> > lauren
> >
>
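
For readers following the thread, the idea discussed above (attach a position to each tuple, then join on it) can be sketched outside Pig. The snippet below is plain Python, not Pig Latin, and does not show the RANK operator itself. The helper names `add_rank` and `join_by_position` and the key "k" are hypothetical, and it assumes each relation is a list of (key, value) rows that already arrive in the intended order.

```python
from collections import defaultdict
from itertools import count


def add_rank(rows):
    """Attach a per-key position to each (key, value) row, keeping input order."""
    counters = defaultdict(count)  # one independent 0-based counter per key
    return [(key, next(counters[key]), value) for key, value in rows]


def join_by_position(left, right):
    """Join two relations on (key, position) after ranking both sides."""
    lookup = {(key, pos): value for key, pos, value in add_rank(right)}
    return [(key, value, lookup[(key, pos)]) for key, pos, value in add_rank(left)]


# Data from the question, with a hypothetical key "k" added to every row.
relation1 = [("k", 5), ("k", 9), ("k", 7)]
relation2 = [("k", "z"), ("k", "a"), ("k", "d")]
relation3 = join_by_position(relation1, relation2)
print(relation3)
```

The sketch relies on the condition stated in the question: for each key, both relations hold the same number of tuples. If that does not hold, the dictionary lookup raises a KeyError instead of silently pairing the wrong rows.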
Alan Gates 2012-08-23, 20:43