Re: A UDF that is both Algebraic and Accumulator
It uses the best one it can. Algebraic is generally better than
Accumulator, and if it can use Algebraic it will. If it can't use either,
it will use the default EvalFunc.

In Pig, there aren't many cases where an EvalFunc that implements both
Algebraic and Accumulator will be evaluated as an Accumulator when it is
used in isolation. But if you mix an Algebraic EvalFunc and an Accumulator
EvalFunc in the same map/reduce portion of the Pig job, I believe that all
of the EvalFuncs will be executed as Accumulators.

There is also AlgebraicEvalFunc, which will give you Accumulator and
EvalFunc implementations for free for your Algebraic.
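
To make the "they compute the same thing" point concrete, here is a minimal
sketch in plain Java. It does not use the real Pig classes: LongSumSketch and
its methods are hypothetical stand-ins that only mirror the shape of the three
evaluation paths (a whole-bag exec, the Algebraic Initial/Intermed/Final
stages, and the Accumulator accumulate/getValue/cleanup cycle). The splits and
chunks are made-up test data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a SUM-like UDF. It mimics the shape of Pig's
// EvalFunc, Algebraic and Accumulator contracts but uses no Pig classes.
final class LongSumSketch {
    private long running = 0L; // state used only by the accumulator path

    // Default EvalFunc path: the whole bag is handed over in one call.
    static long exec(List<Long> bag) {
        long sum = 0L;
        for (long v : bag) {
            sum += v;
        }
        return sum;
    }

    // Algebraic path: Initial sees one value, Intermed and Final see partials.
    static long initial(long value) { return value; }
    static long intermed(List<Long> partials) { return exec(partials); }
    static long finalStep(List<Long> partials) { return exec(partials); }

    // Accumulator path: the bag arrives in chunks, state is kept in between.
    void accumulate(List<Long> chunk) { running += exec(chunk); }
    long getValue() { return running; }
    void cleanup() { running = 0L; }

    // Drives the algebraic stages over several splits of the input.
    static long viaAlgebraic(List<List<Long>> splits) {
        List<Long> partials = new ArrayList<>();
        for (List<Long> split : splits) {
            List<Long> mapped = new ArrayList<>();
            for (long v : split) {
                mapped.add(initial(v));
            }
            partials.add(intermed(mapped)); // combiner-style partial result
        }
        return finalStep(partials);
    }

    // Drives the accumulator cycle over several chunks of one bag.
    static long viaAccumulator(List<List<Long>> chunks) {
        LongSumSketch udf = new LongSumSketch();
        for (List<Long> chunk : chunks) {
            udf.accumulate(chunk);
        }
        long result = udf.getValue();
        udf.cleanup(); // reset state before the next key would be processed
        return result;
    }
}

final class LongSumSketchDemo {
    public static void main(String[] args) {
        List<Long> bag = Arrays.asList(1L, 2L, 3L, 4L);
        List<List<Long>> parts =
            Arrays.asList(Arrays.asList(1L, 2L), Arrays.asList(3L, 4L));
        System.out.println(LongSumSketch.exec(bag));
        System.out.println(LongSumSketch.viaAlgebraic(parts));
        System.out.println(LongSumSketch.viaAccumulator(parts));
    }
}
```

All three paths should give the same answer for the same input, which is why
Pig is free to pick whichever one fits the plan. In a real UDF the Algebraic
stages are separate EvalFunc classes named by getInitial(), getIntermed() and
getFinal(), and everything operates on Tuples and DataBags rather than lists.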
2013/6/4 Ahmed Eldawy <[EMAIL PROTECTED]>

> I wrote a function that is both Algebraic and Accumulator. I tested it on a
> small dataset and it used only the Algebraic interface. When I removed the
> "implements Algebraic" clause and left only the Accumulator interface,
> Pig called that one. So, I need to know how it decides which one to use.
> Best regards,
> Ahmed Eldawy
> On Tue, Jun 4, 2013 at 1:46 PM, Mehmet Tepedelenlioglu wrote:
> > It uses both. They are not contradictory.
> >
> >
> > ________________________________
> >  From: Ahmed Eldawy <[EMAIL PROTECTED]>
> > Sent: Tuesday, June 4, 2013 11:31 AM
> > Subject: A UDF that is both Algebraic and Accumulator
> >
> >
> > In the Apache Pig documentation, it is mentioned that we can define a UDF
> > as both Algebraic and Accumulator
> > http://pig.apache.org/docs/r0.11.1/udf.html#accumulator-interface
> > If I do such a thing, how does Pig decide which of the two interfaces
> > to use? I assume they are completely separate and cannot be mixed with
> > each other. Is there a way to force Pig to prefer one of them over the
> > other?
> >
> > Best regards,
> > Ahmed Eldawy
> >