Pig >> mail # user >> add a field, ordered


Re: add a field, ordered
Hi,

We are finalizing a feature that should solve your problem. It is similar to
ROW_NUMBER in some SQL dialects; we call it RANK.
This operator will add a unique, consecutive row number to each tuple in a
relation.
You will then be able to join the two relations on the rank field.
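
To illustrate the idea outside of Pig, here is a minimal Python sketch of
"number the tuples, then join on the number". The helper names `rank` and
`join_on_rank` are hypothetical and only model the semantics described above;
they are not the Pig operator or its syntax, and the sketch assumes both
relations are already in the desired order.

```python
def rank(relation):
    """Prefix each tuple with a consecutive row number, starting at 1."""
    return [(i, *row) for i, row in enumerate(relation, start=1)]


def join_on_rank(left, right):
    """Inner-join two ranked relations on their leading rank field.

    The rank field is dropped from the joined tuples.
    """
    right_by_rank = {row[0]: row[1:] for row in right}
    return [
        row[1:] + right_by_rank[row[0]]
        for row in left
        if row[0] in right_by_rank
    ]


# The relations from the original question, as lists of 1-field tuples.
relation1 = [(5,), (9,), (7,)]
relation2 = [("z",), ("a",), ("d",)]

relation3 = join_on_rank(rank(relation1), rank(relation2))
print(relation3)  # [(5, 'z'), (9, 'a'), (7, 'd')]
```

The join is done through a dictionary keyed on the rank, so it does not rely
on the two ranked lists still being in the same order at join time.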

For the time being, however, I think there is no easy way to achieve what
you want to do.

Cheers,
--
Gianmarco

On Tue, Aug 14, 2012 at 11:55 AM, Lauren Blau <[EMAIL PROTECTED]> wrote:

> I want to match up tuples from 2 relations. For each key, the 2 relations
> will always have the same number of tuples and match by position (the first
> tuple in each is a match, the second tuple in each, etc.).
>
> so if I have
> relation1 = 5,9,7
> relation2 = z,a,d
>
> I want to end up with
>
> relation3 = (5,z),(9,a),(7,d)
>
> I figure I need a way to generate a matching key on the ordered tuples of
> the relations and then do a cogroup. But I'm stuck on generating the key.
> Since adding a field is a project, I assume this has to be done as part of
> a foreach loop. But I'm not sure how I can maintain the order while adding
> a field to each tuple.
>
> ideas?
> Thanks,
> lauren
>