|
Søren
2012-12-04, 13:31
Edward Capriolo
2012-12-04, 14:43
Søren
2012-12-04, 14:58
Mark Grover
2012-12-05, 03:31
Vivek Mishra
2012-12-05, 10:06
Vivek Mishra
2012-12-05, 10:10
Søren
2012-12-06, 10:43
|
-
handling null argument in custom udf Søren 2012-12-04, 13:31
Hi Hive community
I have a custom UDF, say myfun, written in Java, which I use like this:

select myfun(col_a, col_b) from mytable where ....etc

col_b is a string type and sometimes it is null.

When that happens, my query crashes with
---------------
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
{"col_a":"val","col_b":null}
...
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public org.apache.hadoop.io.Text
---------------

public final class myfun extends UDF {
public Text evaluate(final Text argA, final Text argB) {

I'm unsure how this should be fixed properly. Is the framework looking for an overload of evaluate that would accept the null argument?

I should mention that the table is declared using my own JSON SerDe reading from S3. I'm not processing nulls in my SerDe in any special way, because Hive seems to handle null correctly when it is not passed to my own UDF.

Is there anyone out there with ideas or experience with this issue?

thanks in advance
Søren
-
Re: handling null argument in custom udf Edward Capriolo 2012-12-04, 14:43
There is no null argument. You should handle the null case in your code.
if (argA == null)

Or optionally you could use a generic UDF, but a regular one should handle what you are doing.
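A minimal sketch of the null check suggested here. It is a plain-Java stand-in so that it runs without Hive on the classpath: the class name MyFunSketch, the String parameters, and the concatenation logic are assumptions made for illustration only. In the actual UDF the class would extend UDF and take Text arguments, as in the original post.

```java
// Sketch only: a plain-Java stand-in for the UDF's evaluate method.
// In a real Hive UDF this class would extend org.apache.hadoop.hive.ql.exec.UDF
// and the parameters would be org.apache.hadoop.io.Text.
final class MyFunSketch {

    // Returns null when either argument is null instead of dereferencing it.
    public static String evaluate(final String argA, final String argB) {
        if (argA == null || argB == null) {
            return null;
        }
        // Hypothetical placeholder for the real work of the function.
        return argA + ":" + argB;
    }

    public static void main(String[] args) {
        System.out.println(evaluate("val", "x"));
        System.out.println(evaluate("val", null));
    }
}
```

Returning null for a null input is one reasonable choice, since it matches how most SQL functions treat NULL. Returning a default value instead would also work if the query needs one.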
-
Re: handling null argument in custom udf Søren 2012-12-04, 14:58
Thanks. Did you mean I should handle null in my udf or my serde?
I did try to check for null inside the code in my udf, but it fails even before it gets called.

This is from when the udf fails:
....
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public org.apache.hadoop.io.Text com.company.hive.myfun.evaluate(java.lang.Object,java.lang.Object)
on object com.company.hive.myfun@1412332 of class com.company.hive.myfun with arguments {0:java.lang.Object, null} of size 2

It looks like there is a null, or is this error message misleading?
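One possible reading of this error, sketched under assumptions: when a method is called through reflection, an exception thrown inside the method reaches the caller wrapped in another exception, so the failure can look as if the call itself could not be made. The example below uses only plain JDK reflection. The class ReflectiveNullDemo and its methods are hypothetical and are not Hive's own invocation code.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

final class ReflectiveNullDemo {

    // Hypothetical stand-in for a UDF body that forgets the null check.
    public static String evaluate(String argA, String argB) {
        // Throws NullPointerException when argB is null.
        return argA + ":" + argB.trim();
    }

    // Calls evaluate reflectively and reports what the caller observes.
    static String invoke(String a, String b) {
        try {
            Method m = ReflectiveNullDemo.class.getMethod("evaluate", String.class, String.class);
            return (String) m.invoke(null, a, b);
        } catch (InvocationTargetException e) {
            // The method WAS entered; its exception arrives wrapped.
            return "wrapped: " + e.getCause().getClass().getSimpleName();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(invoke("val", "x"));
        System.out.println(invoke("val", null));
    }
}
```

In this sketch the null argument is passed into the method without any problem, and the failure only happens once the method body touches it. If the same applies here, the full stack trace in the task log should show the underlying cause.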
-
Re: handling null argument in custom udf Mark Grover 2012-12-05, 03:31
Soren,
Can you give the complete stack trace? Or share the code? Perhaps the skeletal code.

Look at the Ceil UDF, for example. It has a null check, and you should be able to do something similar:
https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCeil.java#L43

I would encourage you to use GenericUDF in the long run, though. They perform better because they don't use reflection.

I wrote a blog post a while back to get people started with UDFs. It's at:
http://mark.thegrovers.ca/1/post/2012/06/how-to-write-a-hive-udf.html
Perhaps I should put the content on the Apache wiki, but in the meanwhile, take a look at it...

Using the Translate UDF as an example (reference: https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFTranslate.java), if you would like to have a column accept nulls:

1. Allow the argument type to be "void" type in initialize(), like it's done at https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFTranslate.java#L151
2. Handle null values appropriately in evaluate(), like it's done at https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFTranslate.java#L172

Good luck!
Mark
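The two steps above can be modelled in plain Java as follows. This is only a sketch of the idea and not the GenericUDF API: the class NullTolerantUdfSketch, the ArgType enum, and both method signatures are hypothetical stand-ins chosen so the example runs without Hive. A real implementation would inspect ObjectInspector arguments in initialize(), as the linked GenericUDFTranslate source does.

```java
final class NullTolerantUdfSketch {

    // Hypothetical stand-in for the argument types seen at initialization time.
    enum ArgType { STRING, VOID, INT }

    // Step 1: accept the "void" type alongside the string type,
    // and reject everything else up front.
    static void initialize(ArgType... argTypes) {
        for (ArgType t : argTypes) {
            if (t != ArgType.STRING && t != ArgType.VOID) {
                throw new IllegalArgumentException("unsupported argument type: " + t);
            }
        }
    }

    // Step 2: handle null values explicitly before doing any work.
    static String evaluate(String a, String b) {
        if (a == null || b == null) {
            return null;
        }
        // Hypothetical placeholder for the real work of the function.
        return a + b;
    }

    public static void main(String[] args) {
        initialize(ArgType.STRING, ArgType.VOID);
        System.out.println(evaluate("a", "b"));
        System.out.println(evaluate("a", null));
    }
}
```

Separating the type check from the per-row work means an unsupported argument type is rejected once, before any rows are processed, while null values are still handled row by row.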
-
RE: handling null argument in custom udf Vivek Mishra 2012-12-05, 10:06
Could you please look at your task log/attempt log and share the complete error trace, or the actual error behind this?
-Vivek

NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee. If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference.
-
RE: handling null argument in custom udf Vivek Mishra 2012-12-05, 10:10
The way a UDF works is that you need to tell your ObjectInspector about your primitive or Java types. So in your case, even if the value is null, you should be able to assign it as a String or any other object. Then the invocation of the evaluate() function should know the type of the Java object.
-Vivek
-
Re: handling null argument in custom udf Søren 2012-12-06, 10:43
Right. Thanks for all the help.
It turned out that it did help to check for null in the code. No mystery. I did try that earlier, but the attempt got lost somehow.

Thanks for the advice on using GenericUDF.

cheers
Søren
|