Hive >> mail # user >> Calling same UDF multiple times in a SELECT query


Re: Calling same UDF multiple times in a SELECT query
Thanks Jan

I will mod my UDF and test it out

I want to make sure I understand your words here
"The obvious condition is that it must always return the identical result when called with same parameters."

If I can make sure that a call to the web service is successful, it will always return the same output for a given set of inputs

F(x1,y1) ---->will always equal -----> z1

That's what you mean, right?

sanjay

From: Jan Dolinár <[EMAIL PROTECTED]>
Reply-To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
Date: Tuesday, July 23, 2013 12:35 PM
To: user <[EMAIL PROTECTED]>
Subject: Re: Calling same UDF multiple times in a SELECT query

Hi,

If you use the following annotation, Hive should be able to optimize it to a single call:

 @UDFType(deterministic = true)

The obvious condition is that it must always return the identical result when called with same parameters.

A little more on this can be found in Mark Grover's post at http://mark.thegrovers.ca/1/post/2012/06/how-to-write-a-hive-udf.html.
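
To illustrate why the "identical result for the same parameters" condition matters, here is a minimal sketch in plain Java with no Hive classes. It is not code from this thread: `slowLookup` is a hypothetical stand-in for the web service call, and the in-memory cache is an assumption added for illustration. It only shows that when a function is deterministic, one real call per distinct argument is enough to answer repeated evaluations. A cache like this would at most avoid repeats inside a single JVM, and it says nothing about how Hive itself evaluates the query.

```java
import java.util.HashMap;
import java.util.Map;

public class MemoizedCall {
    // Counts how many times the expensive function was really invoked.
    static int serviceCalls = 0;

    // Hypothetical stand-in for the web service behind fooUdf.
    // It is deterministic: the same input always yields the same output.
    static int slowLookup(int a) {
        serviceCalls++;
        return a * 2 - 10;
    }

    // Results keyed by argument. Reusing them is only safe because
    // slowLookup is deterministic.
    static final Map<Integer, Integer> cache = new HashMap<>();

    static int fooUdf(int a) {
        return cache.computeIfAbsent(a, MemoizedCall::slowLookup);
    }

    // Mirrors the two IF(...) columns from the query for one row:
    // fooUdf(a) appears four times in the expressions.
    static int[] row(int a) {
        int b = fooUdf(a) < -1 ? -1 : fooUdf(a);
        int c = fooUdf(a) < -1 ? fooUdf(a) : 0;
        return new int[] {a, b, c};
    }

    public static void main(String[] args) {
        int[] r = row(3);
        // prints "3 -1 -4 calls=1"
        System.out.println(r[0] + " " + r[1] + " " + r[2] + " calls=" + serviceCalls);
    }
}
```

If `slowLookup` could return different values for the same input, the cached value and a fresh call could disagree, which is why reusing a single result is only valid for deterministic functions.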

Regards,
Jan
On Tue, Jul 23, 2013 at 9:25 PM, Nitin Pawar <[EMAIL PROTECTED]> wrote:
Function return values are not stored for repeated use of the same call (as per my understanding).

I know you may have already thought about another approach, such as:

select a, if(call < -1, -1, call) as b from (select a, fooudf(a) as call from table) t
On Wed, Jul 24, 2013 at 12:42 AM, Sanjay Subramanian <[EMAIL PROTECTED]> wrote:
Hi

We are using version hive-exec-0.9.0-cdh4.1.2 in production

I need to check and use the output from a UDF in a query to assign values to 2 columns in a SELECT query

Example

SELECT
     a,
     IF(fooUdf(a) < -1  , -1, fooUdf(a)) as b,
     IF(fooUdf(a) < -1  , fooUdf(a), 0) as c
FROM
     my_hive_table
So will fooUdf be called 4 times, or once?

This matters because in our case the UDF calls a web service, and I don't want to make that many calls to the service.

Thanks

sanjay

CONFIDENTIALITY NOTICE
======================
This email message and any attachments are for the exclusive use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message along with any attachments, from your computer system. If you are the intended recipient, please be advised that the content of this message is subject to access, review and disclosure by the sender's Email System Administrator.

--
Nitin Pawar