Sergey Goder 2013-05-03, 18:20
Kai Londenberg 2013-05-03, 18:42
Sergey Goder 2013-05-03, 20:36
As for PRODUCT, I don't see why it could not be added to the builtins.
It is a very generic, dependency-free function.
On Fri, May 3, 2013 at 1:36 PM, Sergey Goder <[EMAIL PROTECTED]> wrote:
> Thanks for the tip about numerical accuracy issues and the elegant solution
> exploiting log/exp. It is very much appreciated.
> On Fri, May 3, 2013 at 11:42 AM, Kai Londenberg <[EMAIL PROTECTED]> wrote:
> > Hi,
> > Just a hint: It's usually better to work with log probabilities and sum
> > over them than to work with raw probabilities and use
> > multiplication. Otherwise you might easily run into numerical accuracy
> > issues.
> > i.e. exploit this fact:
> > product(x1, ..., xn) = exp(sum(log(x1), ..., log(xn)))
> > best,
> > Kai Londenberg
> > 2013/5/3 Sergey Goder <[EMAIL PROTECTED]>:
> > > I'm creating a multinomial naive Bayes classifier using Pig and need to
> > > compute the product of probabilities. There are an arbitrary number of
> > > values in the bag, so I would like to be able to use a function similar
> > > to the builtin SUM to do this. I looked through the source code and found
> > > that with some really simple changes to SUM.java I can create a PROD.java
> > > function. I included it in my piggybank and have been using it
> > > successfully.
> > >
> > > I was curious what the community thought about including this function
> > > as a builtin function in a future release? Or would it make more sense to
> > > include this function as a UDF in piggybank?
> > >
> > > Thanks,
> > > Sergey
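
For reference, the log/exp identity Kai mentions above can be sketched in plain Java as follows. This is only an illustration and not the PROD.java from this thread or any Pig API: the class name LogProduct and the methods logProduct and product are hypothetical, and the sketch assumes all inputs are non-negative probabilities.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Pig: computes a product via a sum of logs.
class LogProduct {

    // Returns log(x1 * ... * xn) as log(x1) + ... + log(xn).
    // A zero input yields -Infinity, which exp() maps back to 0.0.
    static double logProduct(double[] probs) {
        double sum = 0.0;
        for (double p : probs) {
            if (p < 0.0) {
                throw new IllegalArgumentException("negative input: " + p);
            }
            sum += Math.log(p);
        }
        return sum;
    }

    // Recovers the plain product: product(x) = exp(sum(log(x))).
    static double product(double[] probs) {
        return Math.exp(logProduct(probs));
    }

    public static void main(String[] args) {
        // 400 factors of 1e-3: the true product is 1e-1200, far below
        // the smallest positive double, so direct multiplication underflows.
        double[] probs = new double[400];
        Arrays.fill(probs, 1e-3);

        double naive = 1.0;
        for (double p : probs) {
            naive *= p;
        }
        System.out.println("naive product: " + naive); // prints 0.0
        System.out.println("log product:   " + logProduct(probs)); // finite
    }
}
```

For a naive Bayes classifier it is usually enough to compare the log values directly between classes, so the final exp() step can often be skipped altogether.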