Pig, mail # dev - Custom Scripting Engine


Connor Woodson 2013-01-19, 02:42
Daniel Dai 2013-01-21, 21:59
Connor Woodson 2013-01-22, 00:47
Jonathan Coveney 2013-01-22, 01:04
Connor Woodson 2013-01-22, 02:22
Jonathan Coveney 2013-01-22, 23:56
Re: Custom Scripting Engine
Connor Woodson 2013-01-23, 00:15
I'm starting work on an R scripting engine. I'm not entirely sure how it
will be used, but I know there have been attempts to get R working with
MapReduce / EMR, and I thought it would be cool to do that through Pig.
One fun use case might be to generate plots/graphs during the MR job and
then do something with them.

The easy way to get this working with Pig is to put the new scripting
engine alongside the existing ones and add it to the ScriptingEngine
enum. However, I would like to use this on EMR, which doesn't update its
software regularly, so I was hoping there was some hook that would get my
scripting engine called. It looks like it can only be used for UDFs for
now.

If a change is going to be made, I think what would help is a change in
how the ScriptingEngine decides which subclass to call. Right now (from
what I can tell) it only looks at the file suffix or the #! first line of
the script and tries to match those against its internal list. Maybe allow
an annotation like
#@ <FQCN of a ScriptingEngine>
as the first line of a script to force Pig to use a specific engine.
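
To make the proposed lookup order concrete (explicit #@ annotation first,
then the #! line, then the file suffix), here is a minimal Java sketch. The
class EngineResolver, its lookup tables, and every engine class name in it
are hypothetical and are not part of Pig; it only illustrates the
precedence, not how Pig actually selects or loads an engine.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of the proposed selection order; not a Pig API. */
final class EngineResolver {
    // Illustrative lookup tables. The class names are placeholders.
    private static final Map<String, String> BY_SHEBANG = new LinkedHashMap<>();
    private static final Map<String, String> BY_SUFFIX = new LinkedHashMap<>();
    static {
        BY_SHEBANG.put("python", "example.engines.JythonEngine");
        BY_SHEBANG.put("ruby", "example.engines.JRubyEngine");
        BY_SUFFIX.put(".py", "example.engines.JythonEngine");
        BY_SUFFIX.put(".rb", "example.engines.JRubyEngine");
    }

    /** Returns the engine class name for a script, if one can be determined. */
    static Optional<String> resolve(String fileName, String firstLine) {
        String line = firstLine == null ? "" : firstLine.trim();

        // 1. Proposed annotation: "#@ <FQCN>" forces a specific engine.
        if (line.startsWith("#@")) {
            String fqcn = line.substring(2).trim();
            if (!fqcn.isEmpty()) {
                return Optional.of(fqcn);
            }
        }

        // 2. Shebang line: match a known interpreter keyword.
        if (line.startsWith("#!")) {
            for (Map.Entry<String, String> e : BY_SHEBANG.entrySet()) {
                if (line.contains(e.getKey())) {
                    return Optional.of(e.getValue());
                }
            }
        }

        // 3. File suffix as the last resort.
        if (fileName != null) {
            for (Map.Entry<String, String> e : BY_SUFFIX.entrySet()) {
                if (fileName.endsWith(e.getKey())) {
                    return Optional.of(e.getValue());
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // The annotation wins even though ".r" is not in the suffix table.
        System.out.println(resolve("plots.r", "#@ com.example.RScriptEngine"));
        System.out.println(resolve("job.py", "import sys"));
    }
}
```

With something like this, a script whose first line is
#@ com.example.RScriptEngine
would resolve to that class without any change to the built-in list, and
the caller could then load it reflectively (for example with
Class.forName) from a jar on the classpath.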

- Connor
On Tue, Jan 22, 2013 at 3:56 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:

> So, something like this is not currently possible, but I think it would be
> possible to expose a set of interfaces that would make this possible. That
> said, why is this desirable? Is your goal to override one of the existing
> SE's, or something? I could imagine reworking things so that anyone can
> register an arbitrary SE, and then we can implement the current SE's in
> terms of that interface. That said, I'm not sure of a compelling reason to
> do this, and would love a use case.
>
> I worked on the JRuby implementation and reviewed the Groovy one and think
> that we could be doing a lot more with scripting languages, so you have my
> attention.
>
>
> 2013/1/21 Connor Woodson <[EMAIL PROTECTED]>
>
> > I want to write a custom scripting engine, and I would like to not have
> > to modify the enum in ScriptingEngine.java to get it to work, both in
> > the 'register' command for UDFs and for embedded scripts. From what I
> > can tell, the former is possible by passing a FQCN to the register
> > command instead of one of the keywords; however, I can't tell if it is
> > possible to get Pig to run my scripting engine when I pass it a non-pig
> > file (e.g. you pass it a .py file and it runs the jython scripting
> > engine). So is this second use possible, or (for now) can custom SE's
> > only be used for UDFs?
> >
> > (I'll admit here that I don't understand what I meant in the end of my
> > previous email; feel free to ignore it).
> >
> > Thanks,
> >
> > - Connor
> >
> >
> > On Mon, Jan 21, 2013 at 5:04 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
> >
> > > Can you describe at a higher level what you have in mind?
> > >
> > >
> > > 2013/1/21 Connor Woodson <[EMAIL PROTECTED]>
> > >
> > > > Is there a way to get Pig to use your custom scripting engine without
> > > > having to modify ScriptingEngine.java and placing it in the enum? It
> > > > looks like it's possible with enums, but what about for embedding
> > > > pig? (as in how Pig can run python scripts).
> > > >
> > > > - Connor
> > > >
> > > >
> > > > On Mon, Jan 21, 2013 at 1:59 PM, Daniel Dai <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > > Pig currently supports jython, jruby, javascript and groovy. If you
> > > > > > need to write another scripting engine, extend ScriptEngine.
> > > > >
> > > > > Here are some references:
> > > > > 1.
> > > > >
> > > >
> > >
> >
> http://www.slideshare.net/daijy/pig-programming-is-more-fun-new-features-in-pig
> > > > > (pp 24, 25)
> > > > > 2. Groovy UDF: https://issues.apache.org/jira/browse/PIG-2763
> > > > > 3. JRuby UDF: https://issues.apache.org/jira/browse/PIG-2317
> > > > > 4. Javascript UDF: https://issues.apache.org/jira/browse/PIG-1794
> > > > >
> > > > > Thanks,
> > > > > Daniel
Jonathan Coveney 2013-01-23, 00:44
Connor Woodson 2013-01-23, 01:10