Pig >> mail # user >> add a field, ordered


Lauren Blau 2012-08-14, 09:55
Gianmarco De Francisci Mo... 2012-08-14, 10:05
Re: add a field, ordered
Is the source for it available in the development area? I'd be happy to
help if I can.
Lauren

On Tue, Aug 14, 2012 at 6:05 AM, Gianmarco De Francisci Morales <
[EMAIL PROTECTED]> wrote:

> Hi,
>
> We are finalizing a feature that would solve your problem, something like
> ROW_NUMBER in some SQL dialects; we call it RANK.
> This operator will add a unique, consecutive row number to each tuple in the
> relation.
> Then you will be able to join the two relations on the rank field.
>
> For the time being, however, I think there is no easy way to achieve what
> you want to do.
>
> Cheers,
> --
> Gianmarco
>
>
>
> On Tue, Aug 14, 2012 at 11:55 AM, Lauren Blau <
> [EMAIL PROTECTED]> wrote:
>
> > I want to match up tuples from 2 relations. For each key, the 2 relations
> > will always have the same number of tuples and match by position (the first
> > tuple in each are a match, the second tuple in each, etc).
> >
> > so if I have
> > relation1 = 5,9,7
> > relation2 = z,a,d
> >
> > I want to end up with
> >
> > relation3 = (5,z),(9,a),(7,d)
> >
> > I figure I need a way to generate a matching key on the ordered tuples of
> > the relations and then do a cogroup. But I'm stuck on generating the key.
> > Since adding a field is a project, I assume this has to be done as part of
> > a foreach loop. But I'm not sure how I can maintain the order while adding
> > a field to each tuple.
> >
> > ideas?
> > Thanks,
> > lauren
> >
>
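
For illustration only, the rank-then-join idea described in the thread can be sketched in plain Python. This is not Pig Latin and not the RANK implementation mentioned above; `add_rank` and `join_on_rank` are hypothetical helper names, and the sample values are the ones from the original question. It assumes both inputs are already in the intended order.

```python
def add_rank(relation):
    """Pair each value with a consecutive row number, starting at 1."""
    return [(rank, value) for rank, value in enumerate(relation, start=1)]


def join_on_rank(left, right):
    """Join two ranked relations on their rank field (hypothetical helper)."""
    right_by_rank = {rank: value for rank, value in right}
    return [
        (value, right_by_rank[rank])
        for rank, value in left
        if rank in right_by_rank
    ]


# Sample data taken from the question in the thread.
relation1 = [5, 9, 7]
relation2 = ["z", "a", "d"]

relation3 = join_on_rank(add_rank(relation1), add_rank(relation2))
print(relation3)
```

The row number acts as the matching key the question asks for: once both relations carry it, matching by position reduces to an ordinary join on that field.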
Alan Gates 2012-08-23, 20:43