Pig >> mail # user >> Sending parameters to a customer load function


Walker, Alan 2012-04-06, 20:38
Dmitriy Ryaboy 2012-04-06, 23:21
Walker, Alan 2012-04-09, 12:34
Dmitriy Ryaboy 2012-04-09, 15:35

Re: Sending parameters to a customer load function
It seems to me that Alan is only interested in writing a loader which has a
non-default constructor (one that takes arguments); he doesn't need to
create a UDF which has this property.

Besides SimpleTextLoader, there are a number of examples of this in the Pig
codebase, including HBaseStorage.  My own attempt at a basic no-op loader
that has a non-default constructor also worked fine.

Alan -- can you share the full loader implementation where you're seeing
this issue?  Also, what version of Pig are you using?  Error #2999 makes me
wonder whether you're hitting an uncaught exception elsewhere in your
loader implementation.
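
For illustration, here is a minimal, self-contained sketch of how a loader
could be instantiated reflectively from the string arguments in a script. It
is not Pig's actual code: the class names (InstantiateSketch, GoodReader,
HiddenCtorReader, ThrowingReader) and the tryInstantiate helper are
hypothetical, and it assumes the lookup only considers public constructors
whose parameters are all Strings. Under that assumption it shows two ways a
class that looks correct could still fail to instantiate: a String
constructor that is not public, and a constructor body that throws.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.InvocationTargetException;
import java.util.Arrays;

class InstantiateSketch {

    // Hypothetical loader stand-in with public no-arg and String constructors.
    public static class GoodReader {
        final String parms;
        public GoodReader() { this(""); }
        public GoodReader(String parms) { this.parms = parms; }
    }

    // Hypothetical loader stand-in whose String constructor is NOT public.
    public static class HiddenCtorReader {
        public HiddenCtorReader() { }
        HiddenCtorReader(String parms) { }
    }

    // Hypothetical loader stand-in whose String constructor throws.
    public static class ThrowingReader {
        public ThrowingReader(String parms) {
            throw new IllegalStateException("bad parameter: " + parms);
        }
    }

    // Looks up a public constructor taking one String per argument and calls it.
    // Returns a short description of what happened instead of rethrowing.
    static String tryInstantiate(Class<?> cls, String... args) {
        Class<?>[] paramTypes = new Class<?>[args.length];
        Arrays.fill(paramTypes, String.class);
        try {
            // getConstructor only finds public constructors.
            Constructor<?> ctor = cls.getConstructor(paramTypes);
            ctor.newInstance((Object[]) args);
            return "ok";
        } catch (NoSuchMethodException e) {
            return "no public constructor";
        } catch (InvocationTargetException e) {
            // The constructor itself threw; the real cause is wrapped.
            return "constructor threw: " + e.getCause().getMessage();
        } catch (ReflectiveOperationException e) {
            return "failed: " + e;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryInstantiate(GoodReader.class, "all"));
        System.out.println(tryInstantiate(HiddenCtorReader.class, "all"));
        System.out.println(tryInstantiate(ThrowingReader.class, "all"));
    }
}
```

If the real loader resembles the second or third case, a generic
"could not instantiate" message could hide the underlying cause, which is why
the full implementation and stack trace would help.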

Norbert

On Mon, Apr 9, 2012 at 11:35 AM, Dmitriy Ryaboy <[EMAIL PROTECTED]> wrote:

> Hi Alan,
> when you use a loader:
> A  = load 'stuff' using my.pig.Loader('foo', 'bar');
>
> the loader gets constructed with 'foo', 'bar', then it gets set up
> (with the various setSignature, prepareToRead, etc, calls), and its
> getNext() gets called repeatedly until there is nothing left to read.
>
> when you use a udf:
> B = foreach A generate my.pig.UDF($0);
>
> pig iterates through relation A and invokes the UDF's exec method on a
> tuple composed of the fields specified in your script -- in this case,
> the first field in each row of A.  If you want to have a non-default
> constructor used to create the UDF instance that will be exec'd on all
> these tuples, you can do this through a "define" call as I described
> earlier.
>
> Loaders (and Storers) are very different from UDFs in how they are
> used and invoked, and they implement totally different interfaces.
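
The difference described above can be sketched in plain Java. This is a toy
model only, not Pig's interfaces: ToyLoader, ToyUdf, and run are hypothetical
names, records are simplified to Strings, and setup is reduced to a single
prepareToRead call. It shows the loader being constructed once with its
arguments, set up, and drained through repeated getNext() calls, while the
UDF instance is constructed once and its exec method is invoked per record.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class LifecycleSketch {

    // Hypothetical, much-simplified stand-in for a load function.
    static class ToyLoader {
        private final String prefix;      // constructor argument from the script
        private Iterator<String> input;   // set during setup, not construction

        ToyLoader(String prefix) { this.prefix = prefix; }

        // Setup step: called once, after construction and before reading.
        void prepareToRead(List<String> records) { input = records.iterator(); }

        // Returns the next record, or null when there is nothing left to read.
        String getNext() { return input.hasNext() ? prefix + input.next() : null; }
    }

    // Hypothetical, much-simplified stand-in for a UDF.
    static class ToyUdf {
        private final String suffix;      // constructor argument, as with "define"

        ToyUdf(String suffix) { this.suffix = suffix; }

        // Called once per record with the selected field.
        String exec(String field) { return field + suffix; }
    }

    // Drives both objects the way the text describes: construct, set up,
    // then pull records from the loader and apply the UDF to each one.
    static List<String> run(List<String> records) {
        ToyLoader loader = new ToyLoader("row:");
        loader.prepareToRead(records);
        ToyUdf udf = new ToyUdf("!");
        List<String> out = new ArrayList<>();
        for (String rec = loader.getNext(); rec != null; rec = loader.getNext()) {
            out.add(udf.exec(rec));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a", "b"))); // prints [row:a!, row:b!]
    }
}
```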
>
> -Dmitriy
>
> On Mon, Apr 9, 2012 at 5:34 AM, Walker, Alan <[EMAIL PROTECTED]>
> wrote:
> > Dmitriy,
> >
> > I have also tried that pattern for a Loader and it doesn't find the
> > String constructor; it only works with the void constructor.
> >
> > grunt> define myreader com.sabre.pigshop.ShoppingReader('all');
> > grunt> A = LOAD '/user/alanw/*.xml' USING myreader() AS (x);
> > 2012-04-09 07:33:58,502 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> > ERROR 2999: Unexpected internal error. could not instantiate
> > 'com.sabre.pigshop.ShoppingReader' with arguments '[all]'
> >
> >
> > This works:
> >
> > grunt> define myreader com.sabre.pigshop.ShoppingReader();
> > grunt> A = LOAD '/user/alanw/*.xml' USING myreader AS (x);
> >
> >
> > I haven't dug into the Pig source yet. Perhaps the Loader functions are
> > treated differently than other UDFs?  Seems unlikely.
> >
> > Thanks,
> > Alan
> >
> >
> > -----Original Message-----
> > From: Dmitriy Ryaboy [mailto:[EMAIL PROTECTED]]
> > Sent: Friday, April 06, 2012 6:21 PM
> > To: [EMAIL PROTECTED]; Walker, Alan
> > Subject: Re: Sending parameters to a customer load function
> >
> > Hi Alan,
> > You can use "define" to supply an argument to a UDF constructor.
> >
> > You can see an example here:
> >
> > http://ofps.oreilly.com/titles/9781449302641/intro_pig_latin.html#udf_define
> >
> > I did just find to my surprise that this isn't in our documentation..
> > we should add that.
> >
> > D
> >
> > On Fri, Apr 6, 2012 at 1:38 PM, Walker, Alan <[EMAIL PROTECTED]> wrote:
> >> Hi,
> >>
> >> I'm having some challenges with a load function.  It only seems to
> >> work with a void constructor.  The Java code has a void constructor and a
> >> String constructor, much like the SimpleTextLoader example.  Any thoughts
> >> on what might be going wrong?
> >>
> >>    public ShoppingReader() {
> >>        parms = "";
> >>    }
> >>
> >>    public ShoppingReader(String tmp) {
> >>        parms = tmp;
> >>    }
> >>
> >> grunt> A = LOAD '/user/alanw/*.xml' USING
> >> com.sabre.pigshop.ShoppingReader('all') AS (x);
> >> 2012-04-06 16:04:08,593 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> >> ERROR 2999: Unexpected internal error. could not instantiate
> >> com.sabre.pigshop.ShoppingReader' with arguments '[all]'
> >>
> >> Thanks,
> >> Alan.
> >>
>