Hive >> mail # user >> Help to solve UDAF errors!


Re: Help to solve UDAF errors!
Hi Mark,

Sorry for the basic questions; I am a beginner with Hive.
I have read in both books (Hadoop: The Definitive Guide and Programming Hive)
that we need to implement init(), iterate(), terminatePartial(), merge() and
terminate() when extending UDAF and implementing UDAFEvaluator.
I implemented a group sum function earlier with those methods and it worked
fine.
But the link you sent asks for init(), aggregate() and evaluate(), which is
completely new to me, since I have only used evaluate() in a UDF.
Is this for some new version of Hive?
The problem here is that I am not able to return an ArrayList of Doubles from
the final terminate() function, although I have seen other working UDAFs
that return an ArrayList of various types.
So how do I return an ArrayList of Doubles?

Thanks,
Abhishek
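
The five-method lifecycle named in the message above (init(), iterate(), terminatePartial(), merge(), terminate()) can be illustrated without Hive on the classpath. The following is only a minimal sketch in plain Java: it imports no Hive classes, and the names TopNPercentSketch, PartialState, Evaluator and demo() are hypothetical stand-ins rather than anything from the posted code or from Hive itself. It assumes "top n percent" means the largest values, and it carries the percent inside the partial state so that the merging side still knows it. Whether that addresses the error discussed in this thread is not established.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Plain-Java sketch: no Hive classes are imported, so the method names only
// mirror the evaluator lifecycle described in the thread.
final class TopNPercentSketch {

    // Hypothetical partial-aggregation state. It carries the percent as well
    // as the values so the merging side does not lose it.
    static final class PartialState {
        ArrayList<Double> values = new ArrayList<>();
        int percent;
    }

    static final class Evaluator {
        private PartialState state;

        Evaluator() {
            init();
        }

        // Reset all state so the evaluator can be reused for a new group.
        void init() {
            state = new PartialState();
        }

        // Consume one input row; null values are skipped.
        boolean iterate(Double rowVal, int pct) {
            if (rowVal != null) {
                state.values.add(rowVal);
            }
            state.percent = pct;
            return true;
        }

        // Hand the partial state to whoever merges it.
        PartialState terminatePartial() {
            return state;
        }

        // Fold another evaluator's partial state into this one.
        boolean merge(PartialState other) {
            if (other != null && !other.values.isEmpty()) {
                state.values.addAll(other.values);
                state.percent = other.percent;
            }
            return true;
        }

        // Return the largest `percent` percent of the values, biggest first.
        ArrayList<Double> terminate() {
            ArrayList<Double> sorted = new ArrayList<>(state.values);
            sorted.sort(Collections.reverseOrder());
            int keep = (int) Math.ceil(sorted.size() * state.percent / 100.0);
            keep = Math.max(0, Math.min(keep, sorted.size()));
            return new ArrayList<>(sorted.subList(0, keep));
        }
    }

    // Simulates two map-side evaluators feeding one reduce-side evaluator.
    static List<Double> demo() {
        Evaluator mapA = new Evaluator();
        Evaluator mapB = new Evaluator();
        for (double v : new double[] {5, 1, 9, 3, 7}) {
            mapA.iterate(v, 20);
        }
        for (double v : new double[] {2, 8, 4, 10, 6}) {
            mapB.iterate(v, 20);
        }

        Evaluator reducer = new Evaluator();
        reducer.merge(mapA.terminatePartial());
        reducer.merge(mapB.terminatePartial());
        return reducer.terminate();
    }
}
```

With the ten sample values and a percent of 20, demo() keeps the two largest values. In the sketch the reduce-side evaluator never sees iterate(), which is why anything terminate() needs has to travel through the partial state.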
On Sun, Feb 10, 2013 at 12:36 PM, Mark Grover
<[EMAIL PROTECTED]> wrote:

> Hi Abhishek,
> The code looks incomplete.
>
> See the comment at
> https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/UDAF.java#L22
> Those are all the methods your UDAF class needs to implement but you seem
> to be missing them.
>
> Mark
>
> On Sat, Feb 9, 2013 at 11:08 PM, Abhishek Bhattacharya <[EMAIL PROTECTED]> wrote:
>
>> Thanks for the response.
>> The link to the code is:
>> https://github.com/Abhishek2301/Hive/blob/master/src/UDAFTopNPercent.java
>> Please let me know how to fix it!
>>
>> Thanks,
>> Abhishek
>>
>>
>>
>> On Fri, Feb 8, 2013 at 5:02 PM, Mark Grover <[EMAIL PROTECTED]> wrote:
>>
>>> Abhishek,
>>> The code doesn't seem to be complete.
>>>
>>> Look at
>>> https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDAFPercentile.java
>>> for reference. It has two terminate()'s - one for UDAF and one for the
>>> Evaluator.
>>>
>>> Do you mind posting your complete code on github somewhere so it's
>>> easier to analyze?
>>>
>>> Mark
>>>
>>> On Fri, Feb 8, 2013 at 2:05 PM, Abhishek Bhattacharya <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have implemented a simple UDAF for top-n-percent as follows:
>>>> import java.util.ArrayList;
>>>> import java.util.Collections;
>>>>
>>>> import org.apache.hadoop.hive.ql.exec.UDAF;
>>>> import org.apache.hadoop.hive.ql.exec.UDAFEvaluator;
>>>>
>>>> public class UDAFTopNPercent extends UDAF{
>>>>
>>>>     public static class Result {
>>>>         ArrayList<Double> list;
>>>>         double min;
>>>>     }
>>>>
>>>>     public class TopNPercentEvaluator implements UDAFEvaluator {
>>>>
>>>>         private Result res;
>>>>         private int rowIndex;
>>>>         private int percent;
>>>>
>>>>         public TopNPercentEvaluator() {
>>>>             super();
>>>>             res = new Result();
>>>>             init();
>>>>             rowIndex = 0;
>>>>         }
>>>>         @Override
>>>>         public void init() {
>>>>             res.list = new ArrayList<Double>();
>>>>             res.min = Double.MAX_VALUE;
>>>>         }
>>>>
>>>>         public boolean iterate(Double rowVal, int pct) {
>>>>             ArrayList<Double> resList = res.list;
>>>>             rowIndex++;
>>>>             resList.add(rowVal);
>>>>             percent = pct;
>>>>             return true;
>>>>         }
>>>>
>>>>         public ArrayList<Double> terminatePartial() {
>>>>             ArrayList<Double> resList = res.list;
>>>>             Collections.sort(resList);
>>>>             return resList;
>>>>         }
>>>>
>>>>         public boolean merge(ArrayList<Double> otherList) {
>>>>             ArrayList<Double> resList = res.list;
>>>>             resList.addAll(otherList);
>>>>             return true;
>>>>         }
>>>>
>>>>         public ArrayList<Double> terminate() {
>>>>             ArrayList<Double> resList = res.list;
>>>>             double num_rows = (double)percent/100.0*rowIndex;
>>>>             Collections.sort(resList);
>>>>             int lastIdx = resList.size()- (int) num_rows;
>>>>             if(lastIdx <= 0) {
Thanks and Regards,

Abhishek Bhattacharya
PhD Computer Science
School of Computing and Information Sciences
Florida International University
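
One detail in the quoted code that may be worth checking: TopNPercentEvaluator is declared as a non-static inner class ("public class" rather than "public static class"). Assuming Hive creates evaluator instances reflectively through a no-argument constructor (an assumption about Hive internals that is not stated anywhere in this thread), a non-static inner class would not offer one, because its constructor implicitly takes the enclosing instance. The sketch below only demonstrates that general Java behaviour; EvaluatorReflectionSketch, InnerEvaluator, NestedEvaluator and hasNoArgConstructor() are hypothetical names, and no Hive classes are involved.

```java
import java.lang.reflect.Constructor;

// Plain-Java sketch of how reflection treats inner versus static nested classes.
final class EvaluatorReflectionSketch {

    // Non-static inner class: its constructor implicitly takes the outer instance.
    class InnerEvaluator {
        InnerEvaluator() {}
    }

    // Static nested class: it has a true no-argument constructor.
    static class NestedEvaluator {
        NestedEvaluator() {}
    }

    // Returns true if the class can be instantiated via a no-argument constructor.
    static boolean hasNoArgConstructor(Class<?> cls) {
        try {
            Constructor<?> ctor = cls.getDeclaredConstructor();
            ctor.setAccessible(true);
            ctor.newInstance();
            return true;
        } catch (ReflectiveOperationException e) {
            // Covers NoSuchMethodException as well as instantiation failures.
            return false;
        }
    }
}
```

Under this sketch, hasNoArgConstructor(InnerEvaluator.class) is false and hasNoArgConstructor(NestedEvaluator.class) is true. If the assumption about Hive holds, declaring the evaluator as a static nested class would be the corresponding change to try.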