Re: Regarding Eval UDF in pig 0.9.1
On 12/20/11 5:26 PM, Jonathan Coveney wrote:
> The UDFContext doesn't really do what it seems like it should do, and as
> Thejas said, if you have multiple instances of the UDF, it will clash.
>

If she is using UDFContext just to read the configuration, it should
be fine.
To avoid clashes when setting UDF properties in UDFContext, pass the
EvalFunc signature in the args of the
getUDFProperties(Class c, String[] args) call.

-Thejas
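
To illustrate the point about passing the signature, here is a toy model of a property store that is looked up by class plus args. It is not Pig's UDFContext, only a stand-in that assumes lookups are keyed this way; the class name UdfPropertiesSketch, the "sig-1"/"sig-2" strings and the "dictionary.path" key are all made up for the example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Simplified stand-in for a per-UDF property store; NOT Pig's UDFContext.
class UdfPropertiesSketch {
    // One Properties object per (class name, args) combination.
    private final Map<String, Properties> store = new HashMap<>();

    Properties getUDFProperties(Class<?> c, String[] args) {
        String key = c.getName() + Arrays.toString(args);
        return store.computeIfAbsent(key, k -> new Properties());
    }

    public static void main(String[] unused) {
        UdfPropertiesSketch ctx = new UdfPropertiesSketch();
        // Two uses of the same UDF class, each passing its own (hypothetical) signature.
        Properties first = ctx.getUDFProperties(String.class, new String[] {"sig-1"});
        Properties second = ctx.getUDFProperties(String.class, new String[] {"sig-2"});
        first.setProperty("dictionary.path", "/tmp/a");
        second.setProperty("dictionary.path", "/tmp/b");
        // Each use keeps its own value because the signatures differ.
        System.out.println(first.getProperty("dictionary.path"));
        System.out.println(second.getProperty("dictionary.path"));
    }
}
```

If both calls passed the same args, they would get the same Properties object back and the second setProperty would overwrite the first, which is the clash described above.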

> This reminds me that the work that was done to get the Schema on the
> backend should allow us to pass a proper Context object as well.
>
> 2011/12/20 Thejas Nair<[EMAIL PROTECTED]>
>
>> Pig does not use multiple threads for executing the udf (at least in
>> versions so far, and I haven't seen any proposals to change that).
>> So you don't need to deal with synchronization issues. But if you are
>> using static variables, remember that there can be multiple instances of
>> the udf - one for each place in the pig-latin script where you use the udf.
>>
>> -Thejas
>>
>>
>>
>>
>>
>> On 12/20/11 3:32 PM, Rohini U wrote:
>>
>>> Hi,
>>>
>>> I am using a static HashMap in an EvalUDF which needs configuration, so I am
>>> initializing it in the exec method after checking whether it is null:
>>>
>>>
>>>      @Override
>>>      public String exec(Tuple input) throws IOException {
>>>          if (dict == null) {
>>>              dict = MyDictionary.getInstance(UDFContext.getUDFContext().getJobConf());
>>>          }
>>>          // Other piece of code here
>>>      }
>>>
>>> Now, in the getInstance method, do I have to take care of any thread
>>> synchronization issues? Is there a chance that multiple threads access it?
>>>
>>> Thanks,
>>> -Rohini
>>>
>>>
>>
>
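
For the getInstance question at the bottom of the thread, a minimal sketch of a lazily initialized shared dictionary might look like the following. The body of MyDictionary is hypothetical (the original post does not show it), java.util.Properties stands in for the job configuration, and "dict.default" is a made-up key. The synchronized keyword is not needed if the UDF is only ever called from one thread, as described above; it is kept here only as inexpensive insurance.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical dictionary; java.util.Properties stands in for the job configuration.
final class MyDictionary {
    // Static, so it is shared by every instance of the UDF in the same JVM.
    private static MyDictionary instance;
    private final Map<String, String> entries = new HashMap<>();

    private MyDictionary(Properties conf) {
        // "dict.default" is an illustrative key, not a real Pig or Hadoop setting.
        entries.put("default", conf.getProperty("dict.default", "n/a"));
    }

    // Builds the dictionary on first use; later calls return the same object.
    static synchronized MyDictionary getInstance(Properties conf) {
        if (instance == null) {
            instance = new MyDictionary(conf);
        }
        return instance;
    }

    String lookup(String key) {
        return entries.get(key);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("dict.default", "hello");
        MyDictionary first = getInstance(conf);
        // The configuration passed here is ignored: the instance already exists.
        MyDictionary second = getInstance(new Properties());
        System.out.println(first == second);
        System.out.println(second.lookup("default"));
    }
}
```

Because the field is static, every place in the script that uses the UDF sees the dictionary built from whichever configuration reached getInstance first, so this only suits the case where all uses want the same configuration.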