-
Hive UDF error : numberformat exception (String to Integer) conversion
praveenesh kumar 2012-05-30, 14:40
Hello Hive Users,
I am facing a strange situation.
I have a string column in my Hive table (it holds an IP address). I am writing a UDF that takes this string column and converts it into a long value. It's a simple UDF. Here is my code:
package com.practice.hive.udf;

public class IPtoINT extends UDF {
    public static LongWritable execute(Text addr) {
        String[] addrArray = addr.toString().split("\\.");
        long num = 0;
        for (int i = 0; i < addrArray.length; i++) {
            int power = 3 - i;
            num += ((Integer.parseInt(addrArray[i]) % 256 * Math.pow(256, power)));
        }
        return new LongWritable(num);
    }
}
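As a sanity check on the arithmetic the loop above is meant to perform, here is a minimal sketch that combines four octets by hand. The class and method names are hypothetical and not part of the posted UDF. For the address 1.0.144.36 it yields 16814116, which is the same value the `ip2` column holds for that row in the error output further down.

```java
// Hypothetical helper for checking the arithmetic; not part of the posted UDF.
class IpArithmeticCheck {
    // Combine four octets into one number: a*256^3 + b*256^2 + c*256 + d.
    static long combine(int a, int b, int c, int d) {
        return a * 16777216L + b * 65536L + c * 256L + d;
    }

    public static void main(String[] args) {
        // 1.0.144.36 -> 1*16777216 + 0*65536 + 144*256 + 36
        System.out.println(combine(1, 0, 144, 36)); // prints 16814116
    }
}
```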
After creating the jar, I run the following commands:
$ hive
hive > add jar /home/hadoop/Desktop/HiveData/IPtoINT.jar;
hive > create temporary function ip2int as 'com.practice.hive.udf.IPtoINT';
hive > select ip2int(ip1) from sample_data;
Running the above gives me the following error:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"ip1":"1.0.144.36","ip2":16814116,"country":"Thailand","key":null}
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"ip1":"1.0.144.36","ip2":16814116,"country":"Thailand","key":null}
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:550)
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
    ... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public org.apache.hadoop.io.Text com.musigma.hive.udf.ip2int.evaluate(org.apache.hadoop.io.Text) on object com.musigma.hive.udf.ip2int@19a4d79 of class com.musigma.hive.udf.ip2int with arguments {1.0.144.36:org.apache.hadoop.io.Text} of size 1
    at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:848)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.evaluate(GenericUDFBridge.java:181)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.evaluate(ExprNodeGenericFuncEvaluator.java:163)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:76)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:531)
    ... 9 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:824)
    ... 18 more
Caused by: java.lang.NumberFormatException: For input string: "1"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Long.parseLong(Long.java:441)
    at java.lang.Long.<init>(Long.java:702)
    at com.musigma.hive.udf.ip2int.evaluate(ip2int.java:11)
    ... 23 more
If I run the Hive UDF with a string literal, like
select ip2int("102.134.123.1") from sample_data;
it does not give any error. The strange thing is that it is a NumberFormatException: I am not able to convert the string to an int in my Java code. Very strange issue.
Can someone please tell me what stupid mistake I am making?
Is there any other UDF that does string to int/long conversion?
Regards, Praveenesh
-
Re: Hive UDF error : numberformat exception (String to Integer) conversion
Edward Capriolo 2012-05-30, 14:44
You should use try/catch and return NULL on bad data. The issue is that if you have a single bad row, the UDF will throw an exception up the chain. The task will be retried, it will fail again, and ultimately the job will fail.
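A minimal sketch of that advice, written against plain `String` and `Long` so it runs without the Hive or Hadoop jars. The class name `SafeIpParser` and its `parse` method are hypothetical. In an actual UDF, the `evaluate(Text)` method would presumably call something like this and wrap a non-null result in a `LongWritable`. Note two deliberate differences from the posted code, both assumptions on my part: octets outside 0-255 are rejected rather than reduced with `% 256`, and surrounding whitespace is trimmed.

```java
// Hypothetical helper illustrating "catch the exception and return null on bad data".
class SafeIpParser {
    // Returns the numeric value of a dotted-quad IPv4 string, or null if it is malformed.
    static Long parse(String addr) {
        if (addr == null) {
            return null;
        }
        String[] parts = addr.trim().split("\\.");
        if (parts.length != 4) {
            return null; // not a dotted quad
        }
        long num = 0;
        try {
            for (String part : parts) {
                int octet = Integer.parseInt(part.trim());
                if (octet < 0 || octet > 255) {
                    return null; // out of range for an IPv4 octet
                }
                num = num * 256 + octet;
            }
        } catch (NumberFormatException e) {
            return null; // bad data: yield NULL instead of failing the whole job
        }
        return num;
    }
}
```

With this shape a single malformed row produces a NULL in the result set, which can then be located with a `where ... is null` filter, instead of failing the task on every retry.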
-
Re: Hive UDF error : numberformat exception (String to Integer) conversion
Nitin Pawar 2012-05-30, 14:47
I won't tell you the error, but I would recommend writing a main function in your UDF and trying it with the sample inputs you expect in your query.
You will then see what mistake you are making.
Nitin Pawar
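A rough sketch of such a test harness follows. The class name and the sample inputs are hypothetical; `convert` repeats the arithmetic of the posted loop on a plain `String` so it runs without Hive on the classpath. Since the exception message in the stack trace shows an input that looks like a valid number ("1"), the harness also prints each input as Unicode code points. One possibility, not confirmed anywhere in this thread, is that the table data contains a character that is invisible when printed, and a code point dump would make that visible.

```java
// Hypothetical stand-alone harness; not the poster's actual test code.
class IpToLongHarness {
    // Same arithmetic as the posted loop, taking a plain String instead of Text.
    static long convert(String addr) {
        String[] parts = addr.split("\\.");
        long num = 0;
        for (int i = 0; i < parts.length; i++) {
            int power = 3 - i;
            num += (long) (Integer.parseInt(parts[i]) % 256 * Math.pow(256, power));
        }
        return num;
    }

    // Render every character as U+XXXX so that invisible characters show up.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed samples: one clean value and three that only look clean or are plainly bad.
        String[] samples = { "1.0.144.36", "1.0.144.36 ", "\uFEFF1.0.144.36", "1.0.144.x" };
        for (String s : samples) {
            try {
                System.out.println(codePoints(s) + " -> " + convert(s));
            } catch (NumberFormatException e) {
                System.out.println(codePoints(s) + " -> FAILED: " + e.getMessage());
            }
        }
    }
}
```

Feeding the harness values copied out of the table's underlying data file, rather than literals typed by hand, is what makes this kind of test meaningful: a hand-typed literal will not contain whatever stray character the stored data might.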
-
Re: Hive UDF error : numberformat exception (String to Integer) conversion
praveenesh kumar 2012-05-30, 16:14
I have done both. There is no null issue here; I checked for nulls as well. Sorry, I did not mention that in the code. I also wrote a main function and called my evaluate function from it. When I pass it a string, it works fine.
The problem is the NumberFormatException. Integer.parseInt is throwing it, and I don't know why. I am converting Hadoop's Text object to a String, splitting it into a String array, and passing each element to Integer.parseInt, which is what fails. What could be the reason? I know it is not Hive related at all; it is only a Java mistake. Please help me resolve this issue. It sounds embarrassing, but I can't see the mistake I am making.
Regards, Praveenesh
-
Re: Hive UDF error : numberformat exception (String to Integer) conversion
Edward Capriolo 2012-05-30, 16:16
Again, I suggest trapping exceptions with try/catch and returning null. If you initialize a logger with log4j or commons logging, you can log the event and find the failure information by clicking through the JobTracker web interface to drill down to the error.
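A minimal sketch of trapping and logging, using `java.util.logging` from the JDK as a stand-in so the example has no external dependencies; log4j or commons logging, as suggested above, would follow the same pattern with their own logger classes. The class and method names are hypothetical, and the conversion repeats the arithmetic from the original post.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper: log the offending value, then return null instead of throwing.
class LoggingIpParser {
    private static final Logger LOG = Logger.getLogger(LoggingIpParser.class.getName());

    static Long parseOrNull(String addr) {
        try {
            String[] parts = addr.split("\\.");
            long num = 0;
            for (int i = 0; i < parts.length; i++) {
                num += (long) (Integer.parseInt(parts[i]) % 256 * Math.pow(256, 3 - i));
            }
            return num;
        } catch (RuntimeException e) {
            // Covers NumberFormatException as well as a null input.
            // Brackets around the value make leading or trailing whitespace easier to spot.
            LOG.log(Level.WARNING, "Could not convert IP value [" + addr + "]", e);
            return null;
        }
    }
}
```

The logged message records the exact input that failed, which the Hive-level stack trace alone does not make obvious when the bad character is not printable.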