Pig >> mail # user >> Get field from bag with constraints from same relation


Thomas Bach 2013-01-22, 11:55
Thomas Bach 2013-01-22, 16:24
Re: Get field from bag with constraints from same relation
Hi Thomas,

Try this:

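-- Load each row as n plus a bag of (m, s) tuples; fields are separated by '|'.
data1 = LOAD '1.txt' USING PigStorage('|') AS (n:int, B:bag{(m:int,s:chararray)});
-- Flatten the bag so every (m, s) tuple becomes its own row next to n.
data2 = FOREACH data1 GENERATE n, FLATTEN(B);
-- Keep only tuples whose m does not exceed n.
data3 = FILTER data2 BY B::m <= n;
-- For each n, sort the remaining tuples by m (descending) and keep the top one.
data4 = GROUP data3 BY n;
data5 = FOREACH data4 {
    data6 = ORDER data3 BY B::m DESC;
    data7 = LIMIT data6 1;
    GENERATE data7;
}
-- Unpack the single-tuple bag and project n and the matching string.
data8 = FOREACH data5 GENERATE FLATTEN(data7);
data9 = FOREACH data8 GENERATE n, B::s;
DUMP data9;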

The input is:
4|{(1,abc),(2,cde),(5,efg)}
2|{(1,foo),(2,bar),(5,baz)}
7|{(1,bounce),(2,frotz),(5,trotz)}

The output is:
(2,bar)
(4,cde)
(7,trotz)
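
As a cross-check of the expected output, here is a minimal Python sketch of the same selection rule (largest m with m <= n). It is not part of the Pig script above; the names rows, pick_string and result are made up for illustration, and the sample data is typed in by hand rather than read from 1.txt.

```python
# Sample rows from this thread: (n, bag of (m, s) tuples).
rows = [
    (4, [(1, "abc"), (2, "cde"), (5, "efg")]),
    (2, [(1, "foo"), (2, "bar"), (5, "baz")]),
    (7, [(1, "bounce"), (2, "frotz"), (5, "trotz")]),
]


def pick_string(n, bag):
    """Return s for the tuple with the largest m such that m <= n, or None."""
    # Mirrors: FILTER ... BY B::m <= n
    eligible = [(m, s) for m, s in bag if m <= n]
    if not eligible:
        return None  # no tuple qualifies for this row
    # Mirrors: ORDER ... BY B::m DESC; LIMIT ... 1
    return max(eligible, key=lambda t: t[0])[1]


result = []
for n, bag in rows:
    s = pick_string(n, bag)
    if s is not None:  # rows with no qualifying tuple are dropped
        result.append((n, s))
result.sort()  # sort by n to match the order of the output shown above

for pair in result:
    print(pair)
```

One difference to keep in mind: the Pig script groups by n, so two input rows with the same n would be merged and yield a single result, whereas this sketch handles every row separately.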

Thanks,
Cheolsoo
On Tue, Jan 22, 2013 at 8:24 AM, Thomas Bach <[EMAIL PROTECTED]> wrote:

> On Tue, Jan 22, 2013 at 12:55:22PM +0100, Thomas Bach wrote:
> > Hi there,
> >
> > I have the following data
> >
> > 4     {(1,abc),(2,cde),(5,efg)}
> > 2     {(1,foo),(2,bar),(5,baz)}
> > 7     {(1,bounce),(2,frotz),(5,trotz)}
> >
> > What I finally want to achieve is, for each row, the string paired with
> > the largest number in the bag that is less than or equal to the first
> > number in the row, i.e.:
> >
> > (4,cde)
> > (2,bar)
> > (5,trotz)
> >
>
> This should be
>
> (4,cde)
> (2,bar)
> (7,trotz)
>
> of course.
>
> Regards,
>         Thomas Bach.
>
Thomas Bach 2013-01-23, 14:32