Pig, mail # user - UDF to calculate Average of whole dataset


Re: UDF to calculate Average of whole dataset
inelu nagamallikarjuna 2013-03-05, 22:12
Hi,

Use the fully qualified class name, such as org.apache.udf.myudf.udfName, when
calling the UDF in the Pig script.
Otherwise, use only the UDF name in the script and pass the package on the
command line when you run it:
pig -Dudf.import.list=org.apache.udf.myudf.evaluation.string scriptname.pig
Thanks
Nagamallikarjuna
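
For the jar quoted below, whose classes sit at the jar root with no package, a minimal sketch of the call might look like the following. It assumes that myudfs.jar is in the working directory and that CalculateAvg accepts the bag produced by GROUP ... ALL. Whether the UDF itself parses the comma-separated text is not known from this thread.

```pig
-- Assumption: myudfs.jar is in the directory Pig is launched from and
-- CalculateAvg.class is at the jar root (default package).
REGISTER ./myudfs.jar;

-- Single-field schema taken from the script quoted below.
dividends = LOAD 'myfile' AS (A);

-- No package means no prefix. The empty entry at the start of the import
-- list in the ERROR 1070 message is what lets a bare class name resolve.
average = FOREACH (GROUP dividends ALL) GENERATE CalculateAvg(dividends);
DUMP average;
```

A prefix such as myudf. or myudfs. is read as a Java package name, not as the jar file name, which is consistent with "Could not resolve myudf.CalculateAvg". If the class is later moved into a package, either the fully qualified name or the -Dudf.import.list option described above would apply.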

On Wed, Mar 6, 2013 at 2:54 AM, Preeti Gupta <[EMAIL PROTECTED]> wrote:

> Nope. It does not work
>
> 2013-03-05 13:22:28,768 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> ERROR 1070: Could not resolve myudf.CalculateAvg using imports: [,
> org.apache.pig.builtin., org.apache.pig.impl.builtin.]
> Details at logfile:
> /Users/PreetiGupta/Documents/CMPS290S/project/pig_1362518535200.log
> ~
>
> Pig script
>
> REGISTER ./myudfs.jar;
> dividends = load 'myfile' as (A);
> dump dividends
> --grouped   = filter dividends by A>-10000000.0;
> --avg       = foreach (filter dividends by A>-10000000.0) generate AVG(A);
> avg = foreach (group dividends all) generate myudf.CalculateAvg(dividends);
> dump avg
>
> My jar file
>
> bash-3.2# vi a.txt
>
>      0 Mon Mar 04 13:45:44 PST 2013 META-INF/
>     60 Mon Mar 04 13:45:44 PST 2013 META-INF/MANIFEST.MF
>   1190 Mon Mar 04 13:45:16 PST 2013 CalculateAvg$Final.class
>   1306 Mon Mar 04 13:45:16 PST 2013 CalculateAvg$Initial.class
>   1477 Mon Mar 04 13:45:16 PST 2013 CalculateAvg$Intermediate.class
>   4205 Mon Mar 04 13:45:16 PST 2013 CalculateAvg.class
> ~
>
> On Mar 5, 2013, at 1:09 PM, pablomar <[EMAIL PROTECTED]>
> wrote:
>
> > did you try with {jarFileName}.{FunctionName} ?
> > example: myudfs.CalculateAvg ?
> >
> >
> > On Tue, Mar 5, 2013 at 4:04 PM, Preeti Gupta <[EMAIL PROTECTED]> wrote:
> >
> >> I kept the code in myudfs.jar, and my Pig script points to it using the
> >> REGISTER command, but the script is not able to find the CalculateAvg
> >> function. I don't have any packages defined in the Java file, and the jar
> >> is in my current directory.
> >>
> >>
> >> On Mar 5, 2013, at 3:17 AM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
> >>
> >>> dividends = load 'try.txt';
> >>> a = foreach dividends generate FLATTEN(TOBAG(*));
> >>> b = foreach (group a all) generate CalculateAvg($1);
> >>>
> >>> I think that should work
> >>>
> >>>
> >>> 2013/3/5 pablomar <[EMAIL PROTECTED]>
> >>>
> >>>> what is the error ?
> >>>> function not found or something like that ?
> >>>>
> >>>> what about this ?
> >>>> avg       = generate myudfs.CalculateAvg(dividends);
> >>>>
> >>>>
> >>>> On Mon, Mar 4, 2013 at 4:56 PM, Preeti Gupta <[EMAIL PROTECTED]> wrote:
> >>>>
> >>>>> Hello All,
> >>>>>
> >>>>> I have a dataset like this:
> >>>>>
> >>>>> 0, 10.1, 20.1, 30, 40,
> >>>>> 50, 60, 70, 80.1, 1,
> >>>>> 2, 3, 4, 5, 6,
> >>>>> 7, 8, 9, 10, 11,
> >>>>> 12, 13, 14, 15, 16,
> >>>>> 1, 2, 3, 4, 5,
> >>>>> 56, 6, 7, 8, 9,
> >>>>> 9, 9, 9, 12, 1,
> >>>>> 3, 14, 1, 5, 6,
> >>>>> 7, 8, 8, 9, 12
> >>>>>
> >>>>> So basically comma-separated values. But I want to treat this as one
> >>>>> data column and calculate the average of the whole dataset.
> >>>>>
> >>>>> I believe I have to write a UDF to calculate the average. Pig is able to
> >>>>> load this data:
> >>>>>
> >>>>> (  0, 10.1, 20.1, 30, 40,)
> >>>>> (  50, 60, 70, 80.1, 1,)
> >>>>> (  2, 3, 4, 5, 6,)
> >>>>> (  7, 8, 9, 10, 11,)
> >>>>> (  12, 13, 14, 15, 16,)
> >>>>> (  1, 2, 3, 4, 5,)
> >>>>> (  56, 6, 7, 8, 9,)
> >>>>> (  9, 9, 9, 12, 1,)
> >>>>> (  3, 14, 1, 5, 6,)
> >>>>> (  7, 8, 8, 9, 12 )
> >>>>>
> >>>>> How do I invoke that UDF in my Pig script? Say I implement a
> >>>>> CalculateAvg function.
> >>>>>
> >>>>> REGISTER ./myudfs.jar
> >>>>> dividends = load 'try.txt';
> >>>>> dump dividends
> >>>>> --grouped   = group dividends by symbol;
> >>>>> avg       = generate CalculateAvg(dividends);
> >>>>> dump avg
> >>>>> --store avg into 'average_dividend';
> >>>>>
> >>>>> It fails.
> >>>>>
> >>>>>
> >>>>
> >>
> >>
>
>
--
Thanks and Regards
Nagamallikarjuna