-
Regarding Eval UDF in pig 0.9.1
Rohini U 2011-12-20, 23:32
Hi,
I am using a static HashMap in an EvalUDF which needs configuration, so I am initializing it in the exec method after checking whether it is null:

    @Override
    public String exec(Tuple input) throws IOException {
        if (dict == null) {
            dict = MyDictionary.getInstance(UDFContext.getUDFContext().getJobConf());
        }
        // Other piece of code here
    }

Now, in the getInstance method, do I have to take care of any thread synchronization issues? Is there a chance that multiple threads access it?
Thanks, -Rohini
-
Re: Regarding Eval UDF in pig 0.9.1
Thejas Nair 2011-12-20, 23:56
Pig does not use multiple threads to execute a UDF (at least in the versions so far, and I haven't seen any proposals to change that), so you don't need to deal with synchronization issues. But if you are using static variables, remember that there can be multiple instances of the UDF, one for each place in the Pig Latin script where you use it.
-Thejas
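A minimal, Pig-free sketch of the point about static variables: two instances of the same class share one static field, so a null-checked lazy initialization runs only once per JVM no matter which instance triggers it. The class name `SharedDictionaryDemo` and the helper `loadDictionary()` are hypothetical stand-ins (the latter for `MyDictionary.getInstance(...)`); nothing here is Pig API.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedDictionaryDemo {
    // Static: shared by every instance of this class in the same JVM.
    private static Map<String, String> dict;
    // Counts how many times the (hypothetical) expensive load ran.
    private static int loadCount = 0;

    // Hypothetical stand-in for MyDictionary.getInstance(conf).
    private static Map<String, String> loadDictionary() {
        loadCount++;
        Map<String, String> m = new HashMap<>();
        m.put("pig", "latin");
        return m;
    }

    // Mirrors the null-check pattern from the question.
    // Not synchronized: this assumes single-threaded execution.
    public String exec(String key) {
        if (dict == null) {
            dict = loadDictionary();
        }
        return dict.get(key);
    }

    public static int getLoadCount() {
        return loadCount;
    }

    public static void main(String[] args) {
        // Two instances, as if the UDF appeared twice in a script.
        SharedDictionaryDemo first = new SharedDictionaryDemo();
        SharedDictionaryDemo second = new SharedDictionaryDemo();
        first.exec("pig");
        second.exec("pig");
        System.out.println("loads: " + getLoadCount()); // prints "loads: 1"
    }
}
```

The second instance sees the map the first one created. That is harmless when every instance wants the same read-only data, but it would matter if different uses of the UDF expected different contents.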
-
Re: Regarding Eval UDF in pig 0.9.1
Jonathan Coveney 2011-12-21, 01:26
The UDFContext doesn't really do what it seems like it should do, and as Thejas said, if you have multiple instances of the UDF, it will clash.
This reminds me that the work that was done to get the Schema on the backend should allow us to pass a proper Context object as well.
-
Re: Regarding Eval UDF in pig 0.9.1
Thejas Nair 2011-12-21, 20:01
On 12/20/11 5:26 PM, Jonathan Coveney wrote:
> The UDFContext doesn't really do what it seems like it should do, and as
> Thejas said, if you have multiple instances of the UDF, it will clash.
If she is using UDFContext only to read the configuration, it should be fine. To avoid clashes when setting UDF properties in UDFContext, pass the EvalFunc signature in the args of the getUDFProperties(Class c, String[] args) call.
-Thejas
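A simplified, Pig-free model of why passing a distinct signature in the args avoids clashes: if properties are looked up by the pair (class, args), two uses of the same class with different args get separate property sets, while identical args map to the same one. This is a sketch under that assumption, not Pig's actual implementation; `KeyedPropertiesDemo`, `getProperties`, and the signature strings are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

public class KeyedPropertiesDemo {
    // Property sets keyed by class name followed by the args.
    private final Map<List<String>, Properties> store = new HashMap<>();

    // Hypothetical analogue of getUDFProperties(Class c, String[] args).
    public Properties getProperties(Class<?> c, String[] args) {
        List<String> key = new ArrayList<>();
        key.add(c.getName());
        if (args != null) {
            key.addAll(Arrays.asList(args));
        }
        Properties props = store.get(key);
        if (props == null) {
            props = new Properties();
            store.put(key, props);
        }
        return props;
    }

    public static void main(String[] args) {
        KeyedPropertiesDemo ctx = new KeyedPropertiesDemo();
        // Same class, two different (made-up) signatures.
        Properties p1 = ctx.getProperties(String.class, new String[] {"sig-1"});
        Properties p2 = ctx.getProperties(String.class, new String[] {"sig-2"});
        p1.setProperty("mode", "first");
        p2.setProperty("mode", "second");
        System.out.println(p1.getProperty("mode")); // prints "first"
        System.out.println(p2.getProperty("mode")); // prints "second"
    }
}
```

Without the distinguishing args, both calls would resolve to the same key and the second `setProperty` would overwrite the first.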
-
Re: Regarding Eval UDF in pig 0.9.1
Rohini U 2011-12-21, 22:31
The value is set only by my Pig script; in the UDF I just want to read it. I won't be setting it.
Thanks -Rohini