Pig >> mail # user >> Error in Top() UDF?


RE: Error in Top() UDF?
Dmitriy,

This looks like a bug. The root cause is in the getArgToFuncMapping()
method of Top.java:

http://svn.apache.org/viewvc/hadoop/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/evaluation/util/Top.java?annotate=770445

A possible workaround for this issue: comment out
getArgToFuncMapping() in Top.java, rebuild piggybank, and use the new
jar. I have not tested it, but that's the first thing I would do.

For the actual fix, the implementation of getArgToFuncMapping() has to
change to return a single-element list that maps the Top class to the
function argument schema. Here, the schema should be (int, int, bag).
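
To illustrate the shape of that mapping, here is a rough, self-contained sketch. It deliberately does not use the Pig classes, so it runs without Pig on the classpath: ArgType, FuncMapping, and resolve() are hypothetical stand-ins I made up for illustration, and the exact-match lookup is a simplification of what Pig actually does when it picks a function. Only the idea (one entry, Top -> (int, int, bag)) comes from the description above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in model, NOT the Pig API: it only illustrates the shape of the
// mapping described above (one entry: Top -> (int, int, bag)).
class TopArgMappingSketch {

    // Hypothetical stand-in for Pig's data type constants.
    enum ArgType { INT, LONG, BAG, BYTEARRAY }

    // Hypothetical stand-in for a (class name, argument schema) pair.
    static final class FuncMapping {
        final String className;
        final List<ArgType> argSchema;

        FuncMapping(String className, List<ArgType> argSchema) {
            this.className = className;
            this.argSchema = argSchema;
        }
    }

    // The suggested shape: a single-element list mapping Top to (int, int, bag).
    static List<FuncMapping> getArgToFuncMapping() {
        List<FuncMapping> mappings = new ArrayList<>();
        mappings.add(new FuncMapping(
                "org.apache.pig.piggybank.evaluation.util.Top",
                Arrays.asList(ArgType.INT, ArgType.INT, ArgType.BAG)));
        return mappings;
    }

    // Simplified exact-match lookup (an assumption for this sketch):
    // returns the class name on a unique match, null on none or several.
    static String resolve(List<ArgType> callArgs) {
        String match = null;
        for (FuncMapping m : getArgToFuncMapping()) {
            if (m.argSchema.equals(callArgs)) {
                if (match != null) {
                    return null; // more than one candidate fits
                }
                match = m.className;
            }
        }
        return match;
    }

    public static void main(String[] args) {
        // prints org.apache.pig.piggybank.evaluation.util.Top
        System.out.println(resolve(
                Arrays.asList(ArgType.INT, ArgType.INT, ArgType.BAG)));
    }
}
```

In the real Top.java the same idea would be expressed with Pig's own function-spec and schema types instead of these stand-ins; I have not written or tested that version.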

Santhosh

-----Original Message-----
From: Dmitriy Ryaboy [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, June 16, 2009 7:20 AM
To: [EMAIL PROTECTED]
Subject: Error in Top() UDF?

In trying to use the Top() function from the piggybank (PIG-732), I
consistently get the error "could not infer matching function".
This happens even when I write a simple script that just goes through
the steps described in the Top() Javadoc.

Thoughts?

grunt> register piggybank.jar
grunt> data = load 'top-data.dat' using PigStorage(',') as (grp, data);

-- what is the source data?
grunt> dump data;
(big,data)
(big,data2)
(big,data3)
(small,data)

 -- go through the steps in the Top() Javadoc
grunt> grouped = group data by grp;
grunt> counted = FOREACH grouped GENERATE FLATTEN(group), COUNT(data) as cnt;
grunt> regrouped = GROUP counted BY group;

grunt> dump regrouped;
(big,{(big,3L)})
(small,{(small,1L)})

grunt> describe regrouped;
regrouped: {group: bytearray,counted: {group: bytearray,cnt: long}}

-- now, let's try that Top() function

grunt> topres = FOREACH regrouped {
          res = org.apache.pig.piggybank.evaluation.util.Top(1,1,counted);
          GENERATE FLATTEN (res);
}
grunt> dump topres;
2009-06-16 07:11:56,867 [main] ERROR org.apache.pig.tools.grunt.Grunt -
ERROR 1045: Could not infer the matching function for
org.apache.pig.piggybank.evaluation.util.Top as multiple or none of them
fit. Please use an explicit cast.