Pig, mail # dev - Custom Scripting Engine


Re: Custom Scripting Engine
Jonathan Coveney 2013-01-23, 00:44
Ahhh, I see. That makes sense. Sadly, this isn't possible in the current
version of Pig, but this is a really good reason to want it. Can you file a
ticket about making it possible to plug in ScriptingEngines without having
to make a code change to Pig? I think it would be useful for exactly this
reason.

That said, if you dig into how these implementations work, they are based
on EvalFuncs, so writing the UDFs by hand is an annoyance but functionally
quite similar.

Question about R: is there a JVM implementation, or are you shelling out?

2013/1/22 Connor Woodson <[EMAIL PROTECTED]>

> I'm starting work on an R scripting engine; I'm not entirely sure how it
> will be used, but I know that there have been attempts to get R working
> with MapReduce / EMR and I thought it would be cool to do that through Pig.
> (One fun use case might be to generate plots/graphs during the MR job (then
> do something with them))
>
> The easy way to get this working with Pig is to put the new scripting
> engines alongside the existing ones and update the ScriptingEngine enum to
> include them. However, I would like to use this on EMR, which doesn't
> update its software regularly, so I was hoping there was some hook to get
> this scripting engine called. It looks like it'll just have to be used for
> UDFs for now.
>
> If a change is going to be made, I think what would be helpful is a change
> in how ScriptingEngine decides which subclass to call; right now (from
> what I can tell) it only looks at the file suffix or the #! first line
> of the script and tries to match those against its internal list. Maybe
> allow an annotation like
> #@ <FQCN of a ScriptingEngine>
> as the first line of a script to force Pig to use a specific engine.
>
> - Connor
>
>
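The lookup order described above (an explicit "#@" annotation first, then the "#!" line, then the file suffix) could be sketched roughly as follows. This is only an illustration under stated assumptions: EngineResolver, its suffix table, and the keyword strings are hypothetical names invented for the example, not Pig's actual ScriptingEngine code, and the "#@" annotation is the proposal from this thread rather than an existing feature.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical resolver; none of these names come from Pig itself. */
final class EngineResolver {
    // Illustrative suffix -> engine keyword table, standing in for the
    // internal list that the ScriptingEngine enum is said to hold.
    private static final Map<String, String> BY_SUFFIX = new LinkedHashMap<>();
    static {
        BY_SUFFIX.put(".py", "jython");
        BY_SUFFIX.put(".rb", "jruby");
        BY_SUFFIX.put(".groovy", "groovy");
    }

    /**
     * Returns an engine keyword or a fully qualified class name, or an
     * empty Optional when nothing matches.
     */
    static Optional<String> resolve(String fileName, String firstLine) {
        String line = firstLine == null ? "" : firstLine.trim();

        // 1. Proposed annotation: "#@ <FQCN of a ScriptingEngine>".
        if (line.startsWith("#@")) {
            String fqcn = line.substring(2).trim();
            if (!fqcn.isEmpty()) {
                return Optional.of(fqcn);
            }
        }

        // 2. "#!" line: match a known engine keyword in the interpreter path.
        if (line.startsWith("#!")) {
            for (String engine : BY_SUFFIX.values()) {
                if (line.contains(engine)) {
                    return Optional.of(engine);
                }
            }
        }

        // 3. Fall back to the file suffix.
        for (Map.Entry<String, String> entry : BY_SUFFIX.entrySet()) {
            if (fileName.endsWith(entry.getKey())) {
                return Optional.of(entry.getValue());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(resolve("plot.R", "#@ com.example.RScriptEngine"));
        System.out.println(resolve("udfs.py", "import sys"));
        System.out.println(resolve("plot.R", "x <- 1"));
    }
}
```

Checking the annotation first means a script can name an engine that the built-in table has never heard of, which is the point of the proposal. A caller would presumably still have to load the returned class name reflectively and check that it really is a scripting engine.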
> On Tue, Jan 22, 2013 at 3:56 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
>
> > So, something like this is not currently possible, but I think we could
> > expose a set of interfaces that would allow it. That said, why is this
> > desirable? Is your goal to override one of the existing SEs, or something
> > else? I could imagine reworking things so that anyone can register an
> > arbitrary SE, and then we could implement the current SEs in terms of
> > that interface. That said, I'm not sure of a compelling reason to do
> > this, and would love a use case.
> >
> > I worked on the JRuby implementation and reviewed the Groovy one and
> > think that we could be doing a lot more with scripting languages, so you
> > have my attention.
> >
> >
> > 2013/1/21 Connor Woodson <[EMAIL PROTECTED]>
> >
> > > I want to write a custom scripting engine, and I would like to not have
> > > to modify the enum in ScriptingEngine.java to get it to work, both in
> > > the 'register' command for UDFs and for embedded scripts. From what I
> > > can tell, the former is possible by passing a FQCN to the register
> > > command instead of one of the keywords; however, I can't tell if it is
> > > possible to get Pig to run my scripting engine when I pass it a non-Pig
> > > file (e.g. you pass it a .py file and it runs the Jython scripting
> > > engine). So is this second use possible, or (for now) can custom SEs
> > > only be used for UDFs?
> > >
> > > (I'll admit here that I don't understand what I meant in the end of my
> > > previous email; feel free to ignore it).
> > >
> > > Thanks,
> > >
> > > - Connor
> > >
> > >
> > > On Mon, Jan 21, 2013 at 5:04 PM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:
> > >
> > > > Can you describe at a higher level what you have in mind?
> > > >
> > > >
> > > > 2013/1/21 Connor Woodson <[EMAIL PROTECTED]>
> > > >
> > > > > Is there a way to get Pig to use your custom scripting engine
> > > > > without having to modify ScriptingEngine.java and placing it in the
> > > > > enum? It looks like it's possible with enums, but what about for
> > > > > embedding pig?
> (as