Re: easiest way to get loops in PIG?
Yang 2012-06-21, 04:33
Well, I tried to give a pure Pig Latin version without using a UDF, but now
it seems too cumbersome without one. There are two problems I can't solve.
First: for a bag [(id: int, extra_info: chararray)], how do I generate the
tuple with the smallest id? Second: is there a version of MIN() for chararray?

Thanks
Yang

On Wed, Jun 20, 2012 at 7:22 PM, Yang <[EMAIL PROTECTED]> wrote:

> hehehe, thanks, I'll post my version slightly later :)
>
> On Wed, Jun 20, 2012 at 7:19 PM, Norbert Burger <[EMAIL PROTECTED]> wrote:
>
>> Yang -- have you seen Hortonworks' blogpost on this?
>>
>> http://hortonworks.com/blog/transitive-closure-in-apache-pig/
>>
>> Norbert
>>
>> On Wed, Jun 20, 2012 at 10:15 PM, Prashant Kommireddi
>> <[EMAIL PROTECTED]> wrote:
>>
>> > Would embedding Pig in java or other languages work?
>> >
>> > http://pig.apache.org/docs/r0.10.0/cont.html#embed-java
>> >
>> > On Jun 20, 2012, at 7:12 PM, Yang <[EMAIL PROTECTED]> wrote:
>> >
>> > > I agree that pig does not have loop probably for a good reason.
>> > >
>> > > but currently I need to write a code to find the transitive closures of
>> > > many edges in a graph.
>> > > so I need to iterate a code snippet several times, so finally I can find a
>> > > connected component of size 2^N
>> > >
>> > > right now I just copy-paste the snippet several times.
>> > >
>> > > I guess I could take out the snippet and make it into a separate pig
>> > > script, and load and store intermediate data
>> > > at the beginning and end. but loading data is kind of a waste.
>> > >
>> > > any suggestions?
>> > >
>> > > Thanks
>> > > Yang
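
For reference, here is a rough sketch of the loop described in the original message, written in plain Python rather than Pig Latin. It is only an illustration under assumptions: the names closure_step, transitive_closure and smallest_by_id are hypothetical, edges are treated as undirected, and everything is held in memory. Each pass joins the relation with itself, so after N passes it covers paths of length up to 2^N; the loop stops once a pass adds nothing new, instead of copy-pasting the snippet a fixed number of times.

```python
def closure_step(pairs):
    """One pass: add (a, c) whenever (a, b) and (b, c) are both present."""
    by_source = {}
    for a, b in pairs:
        by_source.setdefault(a, set()).add(b)
    joined = {(a, c) for a, b in pairs for c in by_source.get(b, ())}
    return pairs | joined


def transitive_closure(edges, max_passes=32):
    """Repeat closure_step until nothing changes (or max_passes is reached)."""
    # Assumption: edges are undirected, so store both directions.
    pairs = set()
    for a, b in edges:
        pairs.add((a, b))
        pairs.add((b, a))
    for _ in range(max_passes):
        next_pairs = closure_step(pairs)
        if next_pairs == pairs:
            break  # fixed point: no new pairs were found
        pairs = next_pairs
    return pairs


def smallest_by_id(bag):
    """Return the (id, extra_info) tuple with the smallest id in a bag."""
    return min(bag, key=lambda t: t[0])


if __name__ == "__main__":
    closure = transitive_closure([(1, 2), (2, 3), (3, 4), (5, 6)])
    print((1, 4) in closure, (1, 5) in closure)
    print(smallest_by_id([(3, "c"), (1, "a"), (2, "b")]))
```

The same stop-when-nothing-changes check is what a driver program would do when Pig is embedded in a host language, as suggested earlier in the thread; the sketch above does not use any Pig API.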