Re: Converting Array to a String
Tucker, Matt 2011-11-17, 14:41
I'm running into the same issue, and I see that it's addressed in HIVE-2223.
In the meantime, I'm getting an error when trying to use the reflect() function:

    SELECT reflect("org.apache.commons.lang.StringUtils", "join", collectedSet), ...

    FAILED: Error in semantic analysis: Line 1:69 Argument type mismatch collectedSet: The parameters of GenericUDFReflect(class,method[,arg1[,arg2]...]) must be primitive (int, double, string, etc).
Matt Tucker Associate eBusiness Analyst Walt Disney Parks and Resorts Online Ph: 407-566-2545 Tie: 8-296-2545
Re: Converting Array to a String
Matt Martin 2011-11-18, 23:52
As currently implemented, the parameters passed to the reflect() UDF must be primitives (and I'm guessing that "collectedSet" is a list). I think this is primarily because Hive supports only a limited number of complex types (specifically: list, map, struct, union), and these types are abstracted in a way that makes it difficult to convert them to concrete Java types and vice versa. For a good discussion of the abstraction used to represent complex types, you may want to refer to the following post: http://www.congiu.com/articles/json_serde

Long story short, you'll probably want to write your own UDF or possibly reuse the existing reflect() UDF to handle your specific case.

Matt

P.S. You can see the source code for the reflect() UDF on GitHub: https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFReflect.java

The exception you are seeing is generated by the following block of code:

    if (arguments[i].getCategory() != ObjectInspector.Category.PRIMITIVE) {
      throw new UDFArgumentTypeException(i,
          "The parameters of GenericUDFReflect(class,method[,arg1[,arg2]...])"
          + " must be primitive (int, double, string, etc).");
    }

At a very high level, I think you would want to remove the for loop where that exception is generated and instead have something like:

    ListObjectInspector listOI = (ListObjectInspector) arguments[2];

Then replace the following lines:

    // Get the parameter values
    for (int i = 2; i < arguments.length; i++) {
      parameterJavaValues[i - 2] = argumentOIs[i].getPrimitiveJavaObject(arguments[i].get());
    }

with something like:

    Object test[] = new Object[listOI.getListLength(arguments[2].get())];
    for (int i = 0; i < test.length; i++) {
      test[i] = listOI.getListElement(arguments[2].get(), i);
    }
    parameterJavaValues[0] = test;

I've left out a lot of details and am probably missing some "gotchas," but hopefully that helps you get the wheels turning...
--
Matt Martin
Think Big Analytics
[EMAIL PROTECTED]
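For anyone who goes the "write your own UDF" route, the joining step itself can be sketched roughly as below. This is only an illustration with no Hive dependencies: the class name ListJoinSketch and the method joinElements are hypothetical, and a plain java.util.List stands in for the elements that a real UDF would read through a ListObjectInspector (getListLength / getListElement, as in the message above).

```java
import java.util.List;

// Hypothetical helper: the class and method names are illustrative, not part of Hive.
class ListJoinSketch {

    // Joins the string form of each element with the given separator.
    // In a real UDF the elements would be read through a ListObjectInspector;
    // a plain java.util.List stands in for that here.
    static String joinElements(List<?> elements, String separator) {
        if (elements == null) {
            return null; // assumption of this sketch: null list in, null out
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < elements.size(); i++) {
            if (i > 0) {
                sb.append(separator);
            }
            Object element = elements.get(i);
            // Assumption of this sketch: a null element contributes an empty string.
            sb.append(element == null ? "" : element.toString());
        }
        return sb.toString();
    }
}
```

Calling ListJoinSketch.joinElements(Arrays.asList("a", "b", "c"), ",") would give "a,b,c". Wiring this into a GenericUDF (argument checking, converting Hive's internal objects to Java values) is left out and is where the "gotchas" are likely to be.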
Re: Converting Array to a String
Miguel Cabero 2011-11-19, 00:37
Hi,

The easiest way I found to convert an array into a string in Hive is something like:

1) ALTER TABLE table_name CHANGE COLUMN array_col_old_name string_col_new_name string;
2) SELECT REPLACE(string_col_new_name, '\002', ', ') FROM table_name;

Regards,
Miguel
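Step 2 of Miguel's suggestion relies on the fact that, in Hive's default text storage format, array elements are separated by the control character \002 (Ctrl-B), so once the column is declared as a string the raw delimited text becomes visible. The replacement itself can be sketched in plain Java as below; this assumes the table uses the default delimiters, and the class name DelimiterReplaceSketch and method toReadable are hypothetical names used only for illustration.

```java
// Hypothetical helper: illustrates the string replacement only, outside of Hive.
class DelimiterReplaceSketch {

    // Default collection-item delimiter in Hive's text storage format (Ctrl-B).
    // Assumption: the table was created with the default delimiters.
    static final String COLLECTION_DELIMITER = "\002";

    // Replaces every collection delimiter in the raw column value with a
    // human-readable separator such as ", ".
    static String toReadable(String rawColumnValue, String separator) {
        if (rawColumnValue == null) {
            return null;
        }
        return rawColumnValue.replace(COLLECTION_DELIMITER, separator);
    }
}
```

If the table was created with a different COLLECTION ITEMS TERMINATED BY character, that character would need to be substituted for \002.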
RE: Converting Array to a String
Tucker, Matt 2011-11-22, 19:12
Thanks Miguel! That did the trick.

Now I just need to sort the input to collect_set(), so I can 'GROUP BY' properly.

Matt Tucker
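On the sorting point: collect_set() does not promise any particular element order, so two groups with the same members can end up as different strings. One way to think about the fix is to build a canonical string by sorting the elements before joining them. The sketch below shows that idea in plain Java; the names CanonicalSetSketch and canonicalKey are hypothetical, and it assumes the elements are non-null strings.

```java
import java.util.Collection;
import java.util.TreeSet;

// Hypothetical helper: shows the "sort before joining" idea outside of Hive.
class CanonicalSetSketch {

    // Builds an order-independent key for a collection of strings.
    // Assumption: values contains no null elements (TreeSet would reject them).
    static String canonicalKey(Collection<String> values, String separator) {
        // TreeSet removes duplicates and iterates in natural (lexicographic) order.
        TreeSet<String> sorted = new TreeSet<>(values);
        return String.join(separator, sorted);
    }
}
```

With this, the inputs ["c", "a", "b"] and ["b", "c", "a"] both map to "a,b,c", which is the property needed for grouping on the resulting string.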
Re: Converting Array to a String
Mark Grover 2011-11-22, 19:39
Matt and Miguel,

You may also be able to dynamically cast the column to string and then do your replace instead of altering the metadata associated with the table.

Mark