Eli Finkelshteyn 2012-02-09, 18:26
Yulia Tolskaya 2012-02-09, 19:14
Eli Finkelshteyn 2012-02-09, 21:20
Brendan Gill 2012-02-10, 12:56
Re: Flatten a Bag on One Line?
pablomar 2012-02-10, 16:09
what about something like this?
(typing on the phone, forgive any mistake)

    public class Flat extends EvalFunc<Tuple> {
        public Tuple exec(Tuple input) throws IOException {
            try {
                List<Object> list = new LinkedList<Object>();
                DataBag bag = (DataBag) input.get(0);
                Iterator it = bag.iterator();
                while (it.hasNext()) {
                    Tuple t = (Tuple) it.next();
                    // keep only the first field of each non-empty tuple
                    if (t != null && t.size() > 0)
                        list.add(t.get(0));
                }
                TupleFactory fac = TupleFactory.getInstance();
                return fac.newTuple(list);
            } catch....

On 2/10/12, Brendan Gill <[EMAIL PROTECTED]> wrote:
> Eli,
>
> I'm trying to do exactly this, but am pretty new to Pig. Any chance you
> would share what the UDF would look like? Then I can tailor it to our
> needs.
>
> Much appreciated if possible,
>
> Brendan
>
> On Thu, Feb 9, 2012 at 9:20 PM, Eli Finkelshteyn <[EMAIL PROTECTED]> wrote:
>
>> Thanks. Was hoping/assuming there was a built-in, but I guess udf it is.
>>
>> Eli
>>
>> On 2/9/12 2:14 PM, Yulia Tolskaya wrote:
>>
>>> I actually can't think of an easy way to do this without it becoming a
>>> cross product. You could just write a really simple udf that takes a bag
>>> and spits out just the members.
>>>
>>> Yulia
>>>
>>> On 2/9/12 1:26 PM, "Eli Finkelshteyn" <[EMAIL PROTECTED]> wrote:
>>>
>>>> This is probably easy, but my PigLatin is rusty, and I don't seem to be
>>>> able to find an answer on Google. If I have a record of the form:
>>>>
>>>> 98812 3 {(48567859),(15996334),(15897772)}
>>>>
>>>> How can I flatten that bag to leave all members on a single row, i.e.:
>>>>
>>>> 98812 3 48567859 15996334 15897772
>>>>
>>>> Cheers,
>>>> Eli
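For anyone who wants to try the flattening logic without a Pig installation, here is a minimal sketch in plain Java. It is not the UDF itself: `List<List<Object>>` is only a stand-in for a bag of tuples, and the names `FlatSketch` and `flatten` are hypothetical, not part of Pig. It applies the same rule as the UDF above: take the first field of every non-null, non-empty tuple, in order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FlatSketch {
    // Stand-in for the UDF body: a "bag" is a list of "tuples",
    // and each "tuple" is a list of fields. These are NOT Pig types.
    public static List<Object> flatten(List<List<Object>> bag) {
        List<Object> out = new ArrayList<>();
        for (List<Object> t : bag) {
            // Same guard as in the UDF: skip null or empty tuples.
            if (t != null && !t.isEmpty()) {
                out.add(t.get(0));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The bag from the original question: {(48567859),(15996334),(15897772)}
        List<List<Object>> bag = Arrays.asList(
            Arrays.<Object>asList(48567859),
            Arrays.<Object>asList(15996334),
            Arrays.<Object>asList(15897772));
        System.out.println(flatten(bag)); // prints [48567859, 15996334, 15897772]
    }
}
```

In a real UDF the result would be wrapped in a Tuple via TupleFactory, as in the code above, so that Pig sees all the members on one row.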
Eli Finkelshteyn 2012-02-10, 21:14
Thejas Nair 2012-02-11, 00:07
Eli Finkelshteyn 2012-02-13, 06:36