Pig, mail # user - Eval UDF passing parameters


Re: Eval UDF passing parameters
Dexin Wang 2010-12-07, 20:09
ah nice. Thank you so much Zach!

On Tue, Dec 7, 2010 at 11:47 AM, Zach Bailey <[EMAIL PROTECTED]> wrote:

>
>  You can pass parameters via the UDF constructor. For example:
>
>
> public MyUDF(String includeAge, String includeGender)
>
>
> then you would initialize it like so in your pig script (constructor
> arguments in a define must be quoted strings):
>
>
> define MY_UDF_ONLY_AGE com.package.MyUDF('true', 'false');
>
>
> and use it like:
>
>
> data_with_age = FOREACH data GENERATE user_id, MY_UDF_ONLY_AGE(user_id);
>
>
> HTH,
> Zach
>
>
> On Tuesday, December 7, 2010 at 2:44 PM, Dexin Wang wrote:
>
> > Hi,
> >
> > This might be a dumb question. Is it possible to pass anything other than
> > the input tuple to a UDF Eval function?
> >
> > Basically in my UDF, I need to do some user info lookup. So the input will
> > be:
> >
> > (userid,f1,f2)
> >
> > with this UDF, I want to convert it to something like
> >
> > (userid,age,gender,location,f1,f2)
> >
> > where in the UDF I do a DB lookup on the userid and return the user's info
> > (age, gender, etc). But I don't necessarily want to pass back the same user
> > info fields, e.g. sometimes I only want age.
> >
> > I hope there is a way for me to tell the UDF that I only want "age", and
> > sometimes "age, location", etc.
> >
> > What's the best way to achieve this without having to write a separate UDF
> > for every case?
> >
> > Thanks.
> > Dexin
> >
> >
> >
> >
>
>
>
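
For anyone finding this thread later, here is a minimal sketch of the constructor-parameter pattern Zach describes. It rests on some assumptions: the class name UserInfoUdf, the in-memory USERS map (a stand-in for the DB lookup) and the exec(String) signature are all hypothetical, and the class does not extend Pig's EvalFunc, so that it compiles without Pig on the classpath. A real UDF would extend org.apache.pig.EvalFunc and take a Tuple. The constructor takes String arguments because, as far as I know, Pig passes define arguments through as string literals.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: a real Pig UDF would extend org.apache.pig.EvalFunc<Tuple>
// and implement exec(Tuple input). That dependency is left out here so the
// class compiles on its own.
class UserInfoUdf {
    // Hypothetical stand-in for the DB lookup: userId -> {age, gender}.
    private static final Map<String, String[]> USERS = new HashMap<>();
    static {
        USERS.put("u1", new String[] {"34", "F"});
        USERS.put("u2", new String[] {"27", "M"});
    }

    private final boolean includeAge;
    private final boolean includeGender;

    // The flags arrive as strings and are parsed once, at construction time.
    UserInfoUdf(String includeAge, String includeGender) {
        this.includeAge = Boolean.parseBoolean(includeAge);
        this.includeGender = Boolean.parseBoolean(includeGender);
    }

    // Returns the user id followed by only the fields that were requested.
    List<String> exec(String userId) {
        List<String> out = new ArrayList<>();
        out.add(userId);
        String[] info = USERS.get(userId);
        if (info == null) {
            return out; // unknown user: nothing to append
        }
        if (includeAge) {
            out.add(info[0]);
        }
        if (includeGender) {
            out.add(info[1]);
        }
        return out;
    }
}
```

With this shape, one class covers every combination: each define in the script creates a differently configured instance, e.g. one alias built with ('true', 'false') for age only and another with ('true', 'true') for age and gender.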