Pig >> mail # user >> Flatten a Bag on One Line?
Re: Flatten a Bag on One Line?
Pig doesn't have a piggybank for python udfs, but it makes sense to
create one.
Please attach your udf to a new jira, and we can figure out where to put it.

-Thejas
On 2/10/12 1:14 PM, Eli Finkelshteyn wrote:
> I was going to do this as a python udf, but haven't had a chance yet
> since other stuff I was working on took priority. As soon as I do write
> it, I'll be sure to upload it here. On a related note: is there a
> piggybank for python udfs I could contribute it to for posterity?
>
> Eli
>
> On 2/10/12 11:09 AM, pablomar wrote:
>> what about something like this?
>> (typing on the phone, forgive any mistake)
>>
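>> import java.io.IOException;
>> import java.util.LinkedList;
>> import java.util.List;
>>
>> import org.apache.pig.EvalFunc;
>> import org.apache.pig.data.DataBag;
>> import org.apache.pig.data.Tuple;
>> import org.apache.pig.data.TupleFactory;
>>
>> // Collects the first field of each tuple in the input bag into one tuple.
>> public class Flat extends EvalFunc<Tuple>
>> {
>>     public Tuple exec(Tuple input) throws IOException
>>     {
>>         try
>>         {
>>             List<Object> list = new LinkedList<Object>();
>>             DataBag bag = (DataBag) input.get(0);
>>             for (Tuple t : bag)
>>             {
>>                 // skip null or empty tuples
>>                 if (t != null && t.size() > 0)
>>                     list.add(t.get(0));
>>             }
>>
>>             TupleFactory fac = TupleFactory.getInstance();
>>             return fac.newTuple(list);
>>         }
>>         catch (Exception e)
>>         {
>>             // assumed handling; the original message left the catch body out
>>             throw new IOException("Flat: could not flatten bag", e);
>>         }
>>     }
>> }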
>>
>> On 2/10/12, Brendan Gill<[EMAIL PROTECTED]> wrote:
>>> Eli,
>>>
>>> I'm trying to do exactly this, but am pretty new to Pig. Any chance you
>>> would share what the UDF would look like? Then I can tailor it to our
>>> needs.
>>>
>>> Much appreciated if possible,
>>>
>>> Brendan
>>>
>>>
>>>
>>> On Thu, Feb 9, 2012 at 9:20 PM, Eli Finkelshteyn<[EMAIL PROTECTED]>
>>> wrote:
>>>
>>>> Thanks. Was hoping/assuming there was a built-in, but I guess udf it
>>>> is.
>>>>
>>>> Eli
>>>>
>>>>
>>>> On 2/9/12 2:14 PM, Yulia Tolskaya wrote:
>>>>
>>>>> I actually can't think of an easy way to do this without it becoming a
>>>>> cross product. You could just write a really simple udf that takes a bag
>>>>> and spits out just the members.
>>>>>
>>>>> Yulia
>>>>>
>>>>> On 2/9/12 1:26 PM, "Eli Finkelshteyn" <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> This is probably easy, but my PigLatin is rusty, and I don't seem to be
>>>>>> able to find an answer on Google. If I have a record of the form:
>>>>>>
>>>>>> 98812 3 {(48567859),(15996334),(15897772)}
>>>>>>
>>>>>> How can I flatten that bag to leave all members on a single row, i.e.:
>>>>>>
>>>>>> 98812 3 48567859 15996334 15897772
>>>>>>
>>>>>> Cheers,
>>>>>> Eli
>>>>>>
>
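
For anyone who wants to check the flattening logic from the thread without a Pig installation, here is a minimal sketch in plain Java. It is not a Pig UDF: the bag is modeled as a list of lists, and the class name FlattenBagSketch and the method flattenRow are hypothetical names made up for illustration. It only mirrors the idea suggested above (take the first field of each tuple in the bag and append it to the row), applied to the sample record from the original question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java stand-in for the UDF logic: a "bag" is modeled as a list of
// "tuples", and each tuple as a list of fields. No Pig classes are used.
class FlattenBagSketch {

    // Appends the first field of every non-null, non-empty tuple in the bag
    // to a copy of the leading fields, producing one flat row.
    static List<Object> flattenRow(List<Object> leadingFields, List<List<Object>> bag) {
        List<Object> row = new ArrayList<>(leadingFields);
        for (List<Object> tuple : bag) {
            if (tuple != null && !tuple.isEmpty()) {
                row.add(tuple.get(0));
            }
        }
        return row;
    }

    public static void main(String[] args) {
        // Sample record from the question: 98812 3 {(48567859),(15996334),(15897772)}
        List<List<Object>> bag = Arrays.asList(
            Arrays.<Object>asList(48567859),
            Arrays.<Object>asList(15996334),
            Arrays.<Object>asList(15897772));
        List<Object> row = flattenRow(Arrays.<Object>asList(98812, 3), bag);
        System.out.println(row); // prints [98812, 3, 48567859, 15996334, 15897772]
    }
}
```

In a real UDF the same loop would run over a DataBag and build the result with TupleFactory, as in the Flat class quoted above.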