RE: Error in Top() UDF?
Dmitry,

This looks like a bug. The root cause of this issue is in
getArgToFuncMapping():

http://svn.apache.org/viewvc/hadoop/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/evaluation/util/Top.java?annotate=770445

A possible workaround for this issue: comment out
getArgToFuncMapping() in Top.java, rebuild piggybank, and use the new
jar. I have not tested it, but that's the first thing I would try.

For the actual fix, the implementation of getArgToFuncMapping() has to
change to return a single-element list that maps the Top class to the
function's argument schema. Here, the schema should be (int, int, bag).
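
To make the shape of that fix concrete, here is a rough, untested sketch.
It uses small stand-in types (ArgType and FuncSpecStandIn, both
hypothetical names) so that it compiles without Pig on the classpath. In
Top.java the same structure would presumably be built from Pig's own
FuncSpec and Schema classes, and the exact constructors should be checked
against the Pig version in use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TopArgMappingSketch {

    // Stand-in for Pig's type constants (hypothetical, for illustration only).
    enum ArgType { INTEGER, BAG }

    // Stand-in for a Pig FuncSpec: an implementing class name plus the
    // argument schema that class accepts (hypothetical, for illustration only).
    static final class FuncSpecStandIn {
        final String className;
        final List<ArgType> argSchema;

        FuncSpecStandIn(String className, List<ArgType> argSchema) {
            this.className = className;
            this.argSchema = argSchema;
        }
    }

    // Shape of the proposed fix: exactly one entry, mapping the Top class
    // to the argument schema (int, int, bag).
    static List<FuncSpecStandIn> getArgToFuncMapping() {
        List<ArgType> schema =
                Arrays.asList(ArgType.INTEGER, ArgType.INTEGER, ArgType.BAG);
        List<FuncSpecStandIn> mapping = new ArrayList<FuncSpecStandIn>();
        mapping.add(new FuncSpecStandIn(
                "org.apache.pig.piggybank.evaluation.util.Top", schema));
        return mapping;
    }

    public static void main(String[] args) {
        // Print each mapping entry as "class -> argument schema".
        for (FuncSpecStandIn spec : getArgToFuncMapping()) {
            System.out.println(spec.className + " -> " + spec.argSchema);
        }
    }
}
```

The point of the single entry is that a call such as Top(1,1,counted)
has only one candidate signature to match against.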

Santhosh

-----Original Message-----
From: Dmitriy Ryaboy [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, June 16, 2009 7:20 AM
To: [EMAIL PROTECTED]
Subject: Error in Top() UDF?

In trying to use the Top() function from the piggybank (PIG-732), I
consistently get the error "could not infer matching function".
This happens even when I write a simple script that just goes through
the steps described in the Top() Javadoc.

Thoughts?

grunt> register piggybank.jar
grunt> data = load 'top-data.dat' using PigStorage(',') as (grp, data);

-- what is the source data?
grunt> dump data;
(big,data)
(big,data2)
(big,data3)
(small,data)

 -- go through the steps in the Top() Javadoc
grunt> grouped = group data by grp;
grunt> counted = FOREACH grouped GENERATE FLATTEN(group), COUNT(data) as cnt;
grunt> regrouped = GROUP counted BY group;

grunt> dump regrouped;
(big,{(big,3L)})
(small,{(small,1L)})

grunt> describe regrouped;
regrouped: {group: bytearray,counted: {group: bytearray,cnt: long}}

-- now, let's try that Top() function

grunt> topres = FOREACH regrouped {
          res = org.apache.pig.piggybank.evaluation.util.Top(1,1,counted);
          GENERATE FLATTEN(res);
}
grunt> dump topres;
2009-06-16 07:11:56,867 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1045: Could not infer the matching function for org.apache.pig.piggybank.evaluation.util.Top as multiple or none of them fit. Please use an explicit cast.