Re: when Algebraic UDF are used ?
Pig can't use the algebraic interface in this case because the bag has to be sorted before it is passed to your UDF, and sorting means Pig has to see all the data for the group first. If you remove the ORDER statement, the algebraic portion of your UDF will be invoked.

Alan.
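
To illustrate the point above, here is a minimal plain-Java sketch of why an algebraic function can be evaluated in pieces. It deliberately does not use Pig's classes: the class name AlgebraicSketch and the methods exec, initial, intermed and finalStage are hypothetical stand-ins that only mimic the roles of a UDF's exec function and its Initial, Intermed and Final stages, using a simple sum as the example. The partial results are computed on separate chunks and merged later, so no stage ever sees the whole bag. A sort is different: it needs the complete bag, which is why the nested ORDER forces the plain exec path.

```java
import java.util.Arrays;
import java.util.List;

public class AlgebraicSketch {
    // Non-algebraic path: sees the whole bag at once (the role of exec).
    static long exec(List<Integer> bag) {
        long sum = 0;
        for (int v : bag) {
            sum += v;
        }
        return sum;
    }

    // Initial stage (hypothetical stand-in): turn one input value into a partial result.
    static long initial(int value) {
        return value;
    }

    // Intermediate stage (hypothetical stand-in): merge several partial results into one.
    static long intermed(List<Long> partials) {
        long sum = 0;
        for (long p : partials) {
            sum += p;
        }
        return sum;
    }

    // Final stage (hypothetical stand-in): merge the remaining partials into the answer.
    static long finalStage(List<Long> partials) {
        return intermed(partials);
    }

    public static void main(String[] args) {
        List<Integer> bag = Arrays.asList(3, 1, 4, 1, 5);

        // Pretend the bag was split across two tasks; each one only sees its own chunk.
        long p1 = intermed(Arrays.asList(initial(3), initial(1)));
        long p2 = intermed(Arrays.asList(initial(4), initial(1), initial(5)));

        // The final stage combines the partials without ever seeing the full bag.
        long combined = finalStage(Arrays.asList(p1, p2));

        System.out.println(exec(bag) + " " + combined); // prints "14 14"
    }
}
```

Both paths give the same answer because a sum does not depend on the order or grouping of its inputs. A UDF that needs its bag sorted by k2 has no such decomposition, at least not without doing the ordering inside the UDF itself.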

On Jul 25, 2012, at 9:32 AM, Benoit Mathieu wrote:

> Hi pig users,
>
> I have coded my own algebraic UDF in Java, and it seems that Pig does not use
> the algebraic interface at all. (I put some log messages in my
> Initial, Intermed and Final functions, and they are never logged.)
> Pig uses only the main "exec" function.
>
> My UDF needs to get the bag sorted.
> Here is my pig script:
>
> A = LOAD '...' USING PigStorage() AS (k1:int,k2:int,value:int);
> B = GROUP A BY k1;
> C = FOREACH B {
>  tmp = ORDER A.(k2,value) BY k2;
>  GENERATE group, MyUDF(tmp);
> }
> ...
>
>
> Does anyone know why Pig does not use the algebraic interface?
>
> thanks,
>
> Benoit