Re: Hive UDAF convert problem
Mark Grover 2012-11-15, 07:01
Hi Cheng,
It's reflection that's causing the problem. You seem to be using the
old class (UDAF) to implement your UDAF. While that may still be fine, just so
that you know, there is a newer, better-performing way to implement UDAFs
(more on that at https://cwiki.apache.org/Hive/genericudafcasestudy.html)
that doesn't require the use of reflection. I would recommend creating a
GenericUDAF rather than using the old method if you are writing a new UDAF.
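For reference, here is a minimal sketch of that structure, modeled on the
case study linked above as a simple count aggregate. This is a sketch only;
the class and field names are illustrative, not from Cheng's code:

----
import org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.parse.SemanticException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDAFResolver;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils;
import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
import org.apache.hadoop.io.LongWritable;

public class GenericUDAFExampleCount implements GenericUDAFResolver {

    @Override
    public GenericUDAFEvaluator getEvaluator(TypeInfo[] parameters)
            throws SemanticException {
        if (parameters.length != 1) {
            throw new UDFArgumentTypeException(parameters.length - 1,
                    "Exactly one argument is expected.");
        }
        return new ExampleCountEvaluator();
    }

    public static class ExampleCountEvaluator extends GenericUDAFEvaluator {

        // Inspector for the partial result (a long) when merging.
        private PrimitiveObjectInspector partialOI;

        @Override
        public ObjectInspector init(Mode m, ObjectInspector[] parameters)
                throws HiveException {
            super.init(m, parameters);
            if (m == Mode.PARTIAL2 || m == Mode.FINAL) {
                partialOI = (PrimitiveObjectInspector) parameters[0];
            }
            // Both the partial and the final result are longs.
            return PrimitiveObjectInspectorFactory.writableLongObjectInspector;
        }

        // The intermediate state lives in an AggregationBuffer and is
        // described to Hive by ObjectInspectors -- no reflection needed.
        static class CountAgg implements AggregationBuffer {
            long count;
        }

        @Override
        public AggregationBuffer getNewAggregationBuffer() throws HiveException {
            CountAgg agg = new CountAgg();
            reset(agg);
            return agg;
        }

        @Override
        public void reset(AggregationBuffer agg) throws HiveException {
            ((CountAgg) agg).count = 0;
        }

        @Override
        public void iterate(AggregationBuffer agg, Object[] parameters)
                throws HiveException {
            if (parameters[0] != null) {
                ((CountAgg) agg).count++;
            }
        }

        @Override
        public Object terminatePartial(AggregationBuffer agg) throws HiveException {
            return terminate(agg);
        }

        @Override
        public void merge(AggregationBuffer agg, Object partial) throws HiveException {
            if (partial != null) {
                ((CountAgg) agg).count +=
                        PrimitiveObjectInspectorUtils.getLong(partial, partialOI);
            }
        }

        @Override
        public Object terminate(AggregationBuffer agg) throws HiveException {
            return new LongWritable(((CountAgg) agg).count);
        }
    }
}
----

The key difference from the old UDAF bridge is that the intermediate state
lives in an AggregationBuffer and is described to Hive through
ObjectInspectors, so Hive never needs reflection to serialize a user-defined
class like VisitSessions.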

I don't have the entire code so it's hard for me to say, but does it help if
you change:
 private TreeMap<Long, Session> sessionMap =...
to be
 private Map<Long, Session> sessionMap =...

That would be good programming practice anyway: declare the field against the
Map interface and keep the concrete TreeMap as an implementation detail.
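As a self-contained sketch of that change (plain TreeMap construction stands
in for Guava's Maps.newTreeMap(), and Session is a stand-in for the real
class):

----
import java.util.Map;
import java.util.TreeMap;

public class VisitSessions {

    // Stand-in for the real Session class, which tracks start/end times.
    static class Session {
        long start;
        long end;
    }

    // Declared against the Map interface; the sorted TreeMap choice is
    // an implementation detail confined to the initializer.
    private Map<Long, Session> sessionMap = new TreeMap<Long, Session>();
}
----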

Mark

On Mon, Nov 12, 2012 at 8:01 PM, Cheng Su <[EMAIL PROTECTED]> wrote:

> Hi all.
>
> I'm writing a Hive UDAF to calculate page views per session. The Java
> source is below:
>
> ----
> import org.apache.hadoop.hive.ql.exec.UDAF;
> import org.apache.hadoop.hive.ql.exec.UDAFEvaluator;
> import org.apache.hadoop.io.FloatWritable;
> import org.apache.hadoop.io.Text;
>
> public class CalculateAvgPVPerSession extends UDAF {
>
>         /**
>          * @author Cheng Su
>          *
>          */
>         public static class CountSessionUDAFEvaluator implements UDAFEvaluator {
>
>                 private VisitSessions visitSessions = new VisitSessions();
>
>                 /* (non-Javadoc)
>                  * @see org.apache.hadoop.hive.ql.exec.UDAFEvaluator#init()
>                  */
>                 @Override
>                 public void init() {
>                         // do nothing
>                 }
>
>                 public boolean iterate(Text value) {
>                         visitSessions.append(value.toString());
>                         return true;
>                 }
>
>                 public VisitSessions terminatePartial() {
>                         return visitSessions;
>                 }
>
>
>                 public boolean merge(VisitSessions other) {
>                         visitSessions.merge(other);
>                         return true;
>                 }
>
>
>                 public FloatWritable terminate() {
>                         return new FloatWritable(visitSessions.getAveragePVPerSession());
>                 }
>         }
>
> }
> ----
>
> VisitSessions is a class which contains a private field of type
> java.util.TreeMap. The source is below:
>
> ----
>
> import java.text.DateFormat;
> import java.text.SimpleDateFormat;
> import java.util.Map.Entry;
> import java.util.TreeMap;
>
> import com.google.common.collect.Maps;
>
> public class VisitSessions {
>
>         private static final DateFormat dateFormat = new SimpleDateFormat("yyyyMMddHHmmss");
>
>         private final long interval;
>
>         private static final class Session {
>                 private long start;
>                 private long end;
>
>                 long getPeriod() {
>                         return end - start;
>                 }
>         }
>
>         private long pageView = 0L;
>
>         private TreeMap<Long, Session> sessionMap = Maps.newTreeMap();
>
>         // ... do sth ...
>
>         public void merge(VisitSessions other) {
>                 for (final Entry<Long, Session> otherSessionEntry : other.sessionMap.entrySet()) {
>                         mergeOne(otherSessionEntry.getValue());
>                 }
>                 pageView += other.pageView;
>         }
>
>
> }
>
> ----
> When I use this UDAF, I get this exception:
> ----
> java.lang.RuntimeException:
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error
> while processing row (tag=0)
>
> {"key":{"_col0":0,"_col1":2011,"_col2":10},"value":{"_col0":{"interval":1800000,"pageview":8957,"sessionmap":{1319818373000:{"start":1319818373000,"end":1319818731000},1319821763000:{"start":1319821763000,"end":1319824141000},1319858388000:{"start":1319858388000,"end":1319865262000}}},"_col1":{"interval":1800000,"pageview":8957,"sessionmap":{1319818373000:{"start":1319818373000,"end":1319818731000},1319821763000:{"start":1319821763000,"end":1319824141000},1319858388000:{"start":1319858388000,"end":1319865262000}}}},"alias":0}
>         at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:268)
>         at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:519)
>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)