Hive >> mail # user >> Help to solve UDAF errors!


Help to solve UDAF errors!
Hi,

I have implemented a simple UDAF for top-n-percent as follows:
import java.util.ArrayList;
import java.util.Collections;

import org.apache.hadoop.hive.ql.exec.UDAF;
import org.apache.hadoop.hive.ql.exec.UDAFEvaluator;

public class UDAFTopNPercent extends UDAF{

    public static class Result {
        ArrayList<Double> list;
        double min;
    }

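    // static, so the evaluator can be instantiated without an enclosing UDAFTopNPercent instance
    public static class TopNPercentEvaluator implements UDAFEvaluator {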

        private Result res;
        private int rowIndex;
        private int percent;

        public TopNPercentEvaluator() {
            super();
            res = new Result();
            init();
            rowIndex = 0;
        }
        @Override
        public void init() {
            res.list = new ArrayList<Double>();
            res.min = Double.MAX_VALUE;
        }

        public boolean iterate(Double rowVal, int pct) {
            ArrayList<Double> resList = res.list;
            rowIndex++;
            resList.add(rowVal);
            percent = pct;
            return true;
        }

        public ArrayList<Double> terminatePartial() {
            ArrayList<Double> resList = res.list;
            Collections.sort(resList);
            return resList;
        }

        public boolean merge(ArrayList<Double> otherList) {
            ArrayList<Double> resList = res.list;
            resList.addAll(otherList);
            return true;
        }

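        public ArrayList<Double> terminate() {
            ArrayList<Double> resList = res.list;
            // Note: rowIndex and percent are only set in iterate(); merge() does not update them.
            double num_rows = (double)percent/100.0*rowIndex;
            Collections.sort(resList);
            int lastIdx = resList.size()- (int) num_rows;
            if(lastIdx <= 0) {
                return resList;
            }
            // Drop the lastIdx smallest values in one step. A forward loop calling
            // remove(i) skips elements because the list shifts left after each removal.
            resList.subList(0, lastIdx).clear();
            return resList;
        }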
    }

    /**
     * @param args
     */
    public static void main(String[] args) {
        // TODO Auto-generated method stub

    }

}
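
For reference, the selection step that terminate() seems to be aiming for can be sketched in plain Java, independent of Hive. This is only a sketch under assumptions: the class name TopNPercentSketch and the method topNPercent are hypothetical, the row count is taken from the list itself instead of a separate counter, and nothing here addresses the Hive error shown in this message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Hive: keeps the largest `percent` percent of the values.
public class TopNPercentSketch {

    /** Returns the largest `percent` percent of `values`, sorted in ascending order. */
    public static List<Double> topNPercent(List<Double> values, int percent) {
        // Work on a copy so the caller's list is left untouched.
        List<Double> sorted = new ArrayList<Double>(values);
        Collections.sort(sorted);
        // Number of rows to keep, truncated toward zero as in the original (int) cast.
        int keep = (int) (percent / 100.0 * sorted.size());
        int firstKept = sorted.size() - keep;
        if (firstKept <= 0) {
            return sorted;
        }
        // Copy the tail of the sorted list; this avoids removing elements
        // one by one while the indices shift underneath the loop.
        return new ArrayList<Double>(sorted.subList(firstKept, sorted.size()));
    }

    public static void main(String[] args) {
        List<Double> values = Arrays.asList(5.0, 1.0, 4.0, 2.0, 3.0, 10.0, 9.0, 8.0, 7.0, 6.0);
        System.out.println(topNPercent(values, 20)); // prints [9.0, 10.0]
    }
}
```

Deriving the count from the list size means the result does not depend on a counter that is only advanced in iterate().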

But it throws an error. The first few lines are:
FAILED: Hive Internal Error: java.lang.ClassCastException(org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableFloatObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector)
java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableFloatObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector
        at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConverter(ObjectInspectorConverters.java:116)
        at org.apache.hadoop.hive.ql.udf.generic.GenericUDFUtils$ConversionHelper.<init>(GenericUDFUtils.java:300)
        at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFBridge$GenericUDAFBridgeEvaluator.init(GenericUDAFBridge.java:129)

Please help me debug this!
Is the error caused by returning ArrayList<Double> from terminate()?
How should I return a List from a UDAF?

Thanks,
Abhishek
Mark Grover 2013-02-08, 23:02
Abhishek Bhattacharya 2013-02-10, 07:08
Mark Grover 2013-02-10, 18:36
Abhishek Bhattacharya 2013-02-11, 03:47
Abhishek Bhattacharya 2013-02-12, 17:48
Robin Morris 2013-02-14, 01:02
Abhishek Bhattacharya 2013-02-14, 01:20