Pig, mail # user - UDF to calculate Average of whole dataset


Re: UDF to calculate Average of whole dataset
inelu nagamallikarjuna 2013-03-05, 22:49
Hi,

Here is a sample UDF and how to use it in a Pig script.

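JAVA CLASS:

package myudf.udf.upper;

import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class UPPER extends EvalFunc<String>
{
        // Sketch of the logic to convert the tokens to upper case (first field only).
        public String exec(Tuple input) throws IOException {
                Object v = (input == null || input.size() == 0) ? null : input.get(0);
                return v == null ? null : v.toString().toUpperCase();
        }
}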

Input data:
naga
siva
ravi

Pig Script:

-- Always use the absolute path of the UDF jar
register /home/naga/bigdata/pig-0.10.0/upper.jar
data = load '/data/names/' using PigStorage() as (name: chararray);
names = foreach data generate myudf.udf.upper.UPPER(name);
dump names;

Output:

org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2013-03-06 04:08:14,017 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2013-03-06 04:08:14,018 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(NAGA)
(SIVA)
(RAVI)
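
The script above works because the UDF is called by its fully qualified name. The error quoted further down in this thread lists the prefixes that were tried ("", org.apache.pig.builtin., org.apache.pig.impl.builtin.), so roughly speaking each prefix is put in front of the name until a class loads. Below is a small plain-Java sketch of that idea only. It is not Pig's actual resolver: ImportListDemo and resolve are made-up names, and JDK packages stand in for the Pig ones so that it runs without Pig on the classpath.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Illustrative only: mimics the idea of an import-list lookup, not Pig's real code.
class ImportListDemo {

    // Try each package prefix in order; return the first class name that loads.
    static Optional<String> resolve(String name, List<String> prefixes) {
        for (String prefix : prefixes) {
            try {
                return Optional.of(Class.forName(prefix + name).getName());
            } catch (ClassNotFoundException | LinkageError e) {
                // Nothing loadable under this prefix; try the next one.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // "" means "use the name exactly as written" (default package or fully qualified).
        List<String> prefixes = Arrays.asList("", "java.util.", "java.lang.");
        System.out.println(resolve("ArrayList", prefixes));           // short name, found via a prefix
        System.out.println(resolve("java.util.ArrayList", prefixes)); // fully qualified, found via ""
        System.out.println(resolve("myudf.CalculateAvg", prefixes));  // no such package, not found
    }
}
```

By the same reasoning, a class compiled without a package statement would be looked up by its bare class name, and a name like myudf.CalculateAvg would only resolve if the class really lives in a package called myudf.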
Thanks
Nagamallikarjuna
On Wed, Mar 6, 2013 at 3:42 AM, inelu nagamallikarjuna <[EMAIL PROTECTED]> wrote:

> Hi,
>
> Use the fully qualified class name, such as org.apache.udf.myudf.udfName, when
> calling the UDF in the Pig script.
> Otherwise, use only the UDF name in the script and pass the package when running it:
> pig -Dudf.import.list=org.apache.udf.myudf.evaluation.string scriptname.pig
>
>
> Thanks
> Nagamallikarjuna
>
>
> On Wed, Mar 6, 2013 at 2:54 AM, Preeti Gupta <[EMAIL PROTECTED]>wrote:
>
>> Nope. It does not work
>>
>> 2013-03-05 13:22:28,768 [main] ERROR org.apache.pig.tools.grunt.Grunt -
>> ERROR 1070: Could not resolve myudf.CalculateAvg using imports: [,
>> org.apache.pig.builtin., org.apache.pig.impl.builtin.]
>> Details at logfile:
>> /Users/PreetiGupta/Documents/CMPS290S/project/pig_1362518535200.log
>> ~
>>
>> Pig script
>>
>> REGISTER ./myudfs.jar;
>> dividends = load 'myfile' as (A);
>> dump dividends
>> --grouped   = filter dividends by A>-10000000.0;
>> --avg       = foreach (filter dividends by A>-10000000.0) generate AVG(A);
>> avg = foreach (group dividends all) generate
>> myudf.CalculateAvg(dividends);
>> dump avg
>>
>> My jar file
>>
>> bash-3.2# vi a.txt
>>
>>      0 Mon Mar 04 13:45:44 PST 2013 META-INF/
>>     60 Mon Mar 04 13:45:44 PST 2013 META-INF/MANIFEST.MF
>>   1190 Mon Mar 04 13:45:16 PST 2013 CalculateAvg$Final.class
>>   1306 Mon Mar 04 13:45:16 PST 2013 CalculateAvg$Initial.class
>>   1477 Mon Mar 04 13:45:16 PST 2013 CalculateAvg$Intermediate.class
>>   4205 Mon Mar 04 13:45:16 PST 2013 CalculateAvg.class
>> ~
>>
>> On Mar 5, 2013, at 1:09 PM, pablomar <[EMAIL PROTECTED]>
>> wrote:
>>
>> > did you try with {jarFileName}.{FunctionName} ?
>> > example: myudfs.CalculateAvg ?
>> >
>> >
>> > On Tue, Mar 5, 2013 at 4:04 PM, Preeti Gupta <[EMAIL PROTECTED]> wrote:
>> >
>> >> I kept the code in myudfs.jar and my Pig script points to it using the
>> >> REGISTER command, but the script is not able to find the CalculateAvg function.
>> >> I don't have any packages defined in the Java file, and the jar is in my
>> >> current directory.
>> >>
>> >>
>> >> On Mar 5, 2013, at 3:17 AM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
>> >>
>> >>> dividends = load 'try.txt';
>> >>> a = foreach dividends generate FLATTEN(TOBAG(*));
>> >>> b = foreach (group a all) generate CalculateAvg($1);
>> >>>
>> >>> I think that should work
>> >>>
>> >>>
>> >>> 2013/3/5 pablomar <[EMAIL PROTECTED]>
>> >>>
>> >>>> what is the error ?
>> >>>> function not found or something like that ?
>> >>>>
>> >>>> what about this ?
>> >>>> avg       = generate myudfs.CalculateAvg(dividends);
>> >>>>
>> >>>>
>> >>>> On Mon, Mar 4, 2013 at 4:56 PM, Preeti Gupta <[EMAIL PROTECTED]> wrote:
>> >>>>
>> >>>>> Hello All,
>> >>>>>
>> >>>>> I have a dataset like this:
>> >>>>>
>> >>>>> 0, 10.1, 20.1, 30, 40,
>> >>>>> 50, 60, 70, 80.1, 1,
>> >>>>> 2, 3, 4, 5, 6,
>> >>>>> 7, 8, 9, 10, 11,
>> >>>>> 12, 13, 14, 15, 16,
>> >>>>> 1, 2, 3, 4, 5,
>> >>>>> 56, 6, 7, 8, 9,
>> >>>>> 9, 9, 9, 12, 1,
>> >>>>> 3, 14, 1, 5, 6,
>> >>>>> 7, 8, 8, 9, 12
>
Thanks and Regards
Nagamallikarjuna
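
On the original question (an average over the whole dataset): the jar listing quoted above contains CalculateAvg$Initial, CalculateAvg$Intermediate and CalculateAvg$Final, which suggests the UDF follows Pig's Algebraic pattern. The CalculateAvg source itself is not shown in this thread, so what follows is only a plain-Java sketch of the sum/count bookkeeping such a UDF typically does. AvgSketch and its method names are made up for illustration, and no Pig classes are used.

```java
// Plain-Java sketch of the partial sum/count idea behind a distributed average.
// The method names only echo the nested class names seen in the jar listing.
class AvgSketch {

    // Initial: turn one raw value into a partial result {sum, count}.
    static double[] initial(double value) {
        return new double[] {value, 1};
    }

    // Intermediate: merge any number of partial results into one partial result.
    static double[] intermediate(double[]... partials) {
        double sum = 0;
        double count = 0;
        for (double[] p : partials) {
            sum += p[0];
            count += p[1];
        }
        return new double[] {sum, count};
    }

    // Final: merge what is left and divide once, at the very end.
    // Returns null when there were no values at all.
    static Double finalAvg(double[]... partials) {
        double[] total = intermediate(partials);
        return total[1] == 0 ? null : total[0] / total[1];
    }

    public static void main(String[] args) {
        // Pretend two tasks each saw half of the values 1, 2, 3, 4.
        double[] taskA = intermediate(initial(1), initial(2));
        double[] taskB = intermediate(initial(3), initial(4));
        System.out.println(finalAvg(taskA, taskB)); // prints 2.5
    }
}
```

The point of carrying sum and count separately is that averaging the per-task averages would give the wrong answer whenever the tasks see different numbers of values.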