Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive >> mail # user >> UDTF fails when used in LATERAL VIEW

Copy link to this message
UDTF fails when used in LATERAL VIEW

I've hit problems when writing custom UDTF that should return string
values. I couldn't find anywhere what type should have the values that
get forward()ed to collector. The only info I could dig out from
google was few blogs with examples and 4 UDTFs that are among the hive
sources. From that I figured out, that it should be OK to simply pass
Strings inside the forwarded Object[] array. Here are the relevant
parts of my code:

      private Object[] forwardListObj;

      public StructObjectInspector initialize(ObjectInspector[] args)
throws UDFArgumentException {

        // snipped irrelevant code

        forwardListObj = new Object[1];
        forwardListObj[0] = new String();

        ArrayList<String> fieldNames = new ArrayList<String>(1);
        ArrayList<ObjectInspector> fieldOIs = new ArrayList<ObjectInspector>(1);


        return ObjectInspectorFactory.getStandardStructObjectInspector(fieldNames,

In proces() there is simple forwarding of some String:

      forwardListObj[0] = "";
      // OR
      String s = ...
      forwardListObj[0] = s;
I was testing the function with a simple query

SELECT my_func(arg) AS x FROM logs WHERE (dt=2011120104);

and it worked just as intended. But at the moment I got from testing
to actually using the function in more complex queries, I got into
trouble. Even LATERAL VIEW statement can cause failures:

SELECT x FROM logs LATERAL VIEW my_func(arg) t AS x WHERE (dt=2011120104);

causes tasks to fail with exception

java.lang.ClassCastException: java.lang.String cannot be cast to
at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.getPrimitiveJavaObject(WritableStringObjectInspector.java:45)
at org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getDouble(PrimitiveObjectInspectorUtils.java:607)
at org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorConverter$DoubleConverter.convert(PrimitiveObjectInspectorConverter.java:229)
at org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqual.evaluate(GenericUDFOPEqual.java:73)
at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.evaluate(ExprNodeGenericFuncEvaluator.java:86)
at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator$DeferredExprObject.get(ExprNodeGenericFuncEvaluator.java:56)
at org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPAnd.evaluate(GenericUDFOPAnd.java:52)
at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.evaluate(ExprNodeGenericFuncEvaluator.java:86)
at org.apache.hadoop.hive.ql.exec.FilterOperator.processOp(FilterOperator.java:83)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:744)
at org.apache.hadoop.hive.ql.exec.LateralViewJoinOperator.processOp(LateralViewJoinOperator.java:133)
at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:744)
at org.apache.hadoop.hive.ql.exec.UDTFOperator.forwardUDTFOutput(UDTFOperator.java:112)
at org.apache.hadoop.hive.ql.udf.generic.UDTFCollector.collect(UDTFCollector.java:44)
at org.apache.hadoop.hive.ql.udf.generic.GenericUDTF.forward(GenericUDTF.java:81)
at cz.seznam.im.functions.ExplodeSection.process(ExplodeSection.java:103)

I should also mention that I use custom SerDe and InputFormat for the
'logs' table. When I was trying to figure it out, I was trying to run
the same queries as listed above on different table without the
customizations and it worked correctly too. So I think the SerDe
and/or InputFormat probably play some role in this as well. What I
don't understand is why the problem exhibits itself only with LATERAL
VIEW. Any ideas anyone? Also, is it really correct to send String in

Best regards,