Pig, mail # user - Counters from Python UDF


Duckworth, Will 2012-08-17, 14:03
Aniket Mokashi 2012-08-17, 21:53
Duckworth, Will 2012-08-23, 21:28
Jonathan Coveney 2012-08-23, 21:43
Aniket Mokashi 2012-08-24, 17:10
Jonathan Coveney 2012-08-24, 17:30
RE: Counters from Python UDF
Duckworth, Will 2012-08-25, 02:58
Code below works against trunk.

Apache Pig version 0.11.0-SNAPSHOT (r1372967)
compiled Aug 14 2012, 15:31:10

pig -f test_counter.pig -p in_path=/path/to/file/test_file.gz -p job_name=counter_test

*** test_counter.py
from org.apache.pig.tools.counters import PigCounterHelper

@outputSchema("line:chararray")
def testCounter(line):
    # Increment counter "udfcounter" in group "Test" once per input line.
    counter = PigCounterHelper()
    counter.incrCounter("Test", "udfcounter", 1)
    return line

*** test_counter.pig
-- $in_path
-- $job_name

SET job.name '$job_name';

REGISTER '/path/to/python_file/test_counter.py' USING jython AS udf;

A = load '$in_path' using PigStorage('\n') as (line:chararray);

A2 = foreach A generate udf.testCounter(line) as line;
A3 = limit A2 10;
dump A3;
Will Duckworth  Senior Vice President, Software Engineering  | comScore, Inc.(NASDAQ:SCOR)
o +1 (703) 438-2108 | m +1 (301) 606-2977 | mailto:[EMAIL PROTECTED]
.....................................................................................................
-----Original Message-----
From: Jonathan Coveney [mailto:[EMAIL PROTECTED]]
Sent: Friday, August 24, 2012 1:31 PM
To: [EMAIL PROTECTED]
Subject: Re: Counters from Python UDF

I think adding a method to jython/jruby is absolutely the way to go

2012/8/24 Aniket Mokashi <[EMAIL PROTECTED]>

> I used the following in my Python UDF (on Pig 0.9) after referring to:
>
> http://squarecog.wordpress.com/2010/12/24/incrementing-hadoop-counters-in-apache-pig/
>
> from org.apache.pig.tools.pigstats import PigStatusReporter
> reporter = PigStatusReporter.getInstance();
>
> But it looks like the context is not set in PigStatusReporter when the
> UDF is invoked, so it fails. I think we need some caching logic similar
> to PigCounterHelper, until something sets the context in
> PigCounterHelper. I wonder how this works.
>
> We can add a helper udf at JythonScriptingEngine.init (or some such)
> method to expose these elegantly. Thoughts?
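
The caching idea mentioned above can be sketched roughly as follows. This illustrates the buffering pattern only and is not Pig's actual PigCounterHelper source: BufferedCounters, get_counter, FakeReporter and FakeCounter are hypothetical names, and the assumption is that the reporter returns None for a counter until the task context has been set.

```python
class BufferedCounters(object):
    """Holds increments until the reporter can hand out real counters."""

    def __init__(self, get_reporter):
        # get_reporter: zero-argument callable returning a reporter or None.
        self._get_reporter = get_reporter
        self._pending = {}

    def incr(self, group, name, amount):
        reporter = self._get_reporter()
        counter = None
        if reporter is not None:
            counter = reporter.get_counter(group, name)
        if counter is None:
            # Context not ready yet: remember the increment for later.
            key = (group, name)
            self._pending[key] = self._pending.get(key, 0) + amount
            return False
        # Context is ready: flush buffered values, then apply this increment.
        self._flush(reporter)
        counter.increment(amount)
        return True

    def _flush(self, reporter):
        for (group, name), amount in list(self._pending.items()):
            pending_counter = reporter.get_counter(group, name)
            if pending_counter is not None:
                pending_counter.increment(amount)
                del self._pending[(group, name)]


class FakeCounter(object):
    """Hypothetical counter used only to demonstrate the pattern."""

    def __init__(self):
        self.value = 0

    def increment(self, amount):
        self.value += amount


class FakeReporter(object):
    """Hypothetical reporter: returns None until 'ready' is set."""

    def __init__(self):
        self.ready = False
        self.counters = {}

    def get_counter(self, group, name):
        if not self.ready:
            return None
        return self.counters.setdefault((group, name), FakeCounter())


reporter = FakeReporter()
counters = BufferedCounters(lambda: reporter)
counters.incr("Test", "udfcounter", 1)  # buffered, context not set yet
reporter.ready = True
counters.incr("Test", "udfcounter", 1)  # flushes the buffered 1, then adds 1
```

The point of the design is that early increments are not lost: they sit in the pending map and are applied the first time a real counter becomes available.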
>
> ~Aniket
>
> On Thu, Aug 23, 2012 at 2:43 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
>
> > In trunk this should be possible (it's possible in 0.10 as well, I just
> > am not sure if PigCounterHelper is there). Either way, take a look at
> > PigCounterHelper. All you have to do is instantiate a copy in your UDF
> > and use it from there.
> >
> > This hinges on all of the static stuff that Pig relies on working...
> > I think that the way that we invoke these scripting languages should
> > work, but this will verify that :)
> >
> > 2012/8/23 Duckworth, Will <[EMAIL PROTECTED]>
> >
> > > This may be a better question for the DEV list but ... Is it even
> > > possible / feasible?  Could it be done by calling the Java classes
> > > from within Jython?
> > >
> > > I guess I would ask the same about algebraic and accumulator UDFs,
> > > which I know are available in Ruby.
> > >
> > > -----Original Message-----
> > > From: Aniket Mokashi [mailto:[EMAIL PROTECTED]]
> > > Sent: Friday, August 17, 2012 5:54 PM
> > > To: [EMAIL PROTECTED]
> > > Subject: Re: Counters from Python UDF
> > >
> > > > I don't think there is a way at this point. You may have to open a jira.
> > >
> > > Thanks,
> > > Aniket
> > >
> > > > On Fri, Aug 17, 2012 at 7:03 AM, Duckworth, Will <[EMAIL PROTECTED]> wrote:
> > >
> > > > > Has anyone poked around to see if there is a way to create
> > > > > / increment counters from a Python UDF?  Thanks.
> > > >
> > > >
> > > >
> > > > > Will Duckworth Senior Vice President, Software Engineering |
> > > > > comScore, Inc. (NASDAQ:SCOR)
> > > >
> > > > o +1 (703) 438-2108 | m +1 (301) 606-2977 |
> > > > [EMAIL PROTECTED] <mailto:[EMAIL PROTECTED]>
> > > >
> > > >
> > > >
> > > > > ...........................................................................................................
> > > > >
> > > > > Introducing Mobile Metrix 2.0 - The next generation of mobile
> > > > > behavioral measurement www.comscore.com/MobileMetrix<
> > > > > http://www.comscore.com/Products_Services/Product_Index/Mobile_Metrix_
Jonathan Coveney 2012-08-27, 21:09