-
Custom Scripting Engine
Connor Woodson 2013-01-19, 02:42
Is there any support for a custom scripting engine, to allow UDFs to be written in a different language, or to embed Pig in another language?

- Connor
-
Re: Custom Scripting Engine
Daniel Dai 2013-01-21, 21:59
Pig currently supports Jython, JRuby, JavaScript and Groovy. If you need to write another scripting engine, extend ScriptEngine.

Here are some references:
1. http://www.slideshare.net/daijy/pig-programming-is-more-fun-new-features-in-pig (pp 24, 25)
2. Groovy UDF: https://issues.apache.org/jira/browse/PIG-2763
3. JRuby UDF: https://issues.apache.org/jira/browse/PIG-2317
4. Javascript UDF: https://issues.apache.org/jira/browse/PIG-1794

Thanks,
Daniel
-
Re: Custom Scripting Engine
Connor Woodson 2013-01-22, 00:47
Is there a way to get Pig to use your custom scripting engine without having to modify ScriptEngine.java and add the engine to the enum? It looks like it's possible with enums, but what about for embedding Pig (as in how Pig can run Python scripts)?

- Connor
-
Re: Custom Scripting Engine
Jonathan Coveney 2013-01-22, 01:04
Can you describe at a higher level what you have in mind?
-
Re: Custom Scripting Engine
Connor Woodson 2013-01-22, 02:22
I want to write a custom scripting engine, and I would like it to work both in the 'register' command for UDFs and for embedded scripts, without having to modify the enum in ScriptEngine.java. From what I can tell, the former is possible by passing a FQCN to the register command instead of one of the keywords; however, I can't tell whether it is possible to get Pig to run my scripting engine when I pass it a non-Pig file (e.g. you pass it a .py file and it runs the Jython scripting engine). So is this second use possible, or (for now) can custom SEs only be used for UDFs?

(I'll admit here that I don't understand what I meant at the end of my previous email; feel free to ignore it.)

Thanks,

- Connor
-
Re: Custom Scripting Engine
Jonathan Coveney 2013-01-22, 23:56
So, something like this is not currently possible, but I think we could expose a set of interfaces that would allow it. That said, why is this desirable? Is your goal to override one of the existing SEs, or something? I could imagine reworking things so that anyone can register an arbitrary SE, and then we could implement the current SEs in terms of that interface. That said, I'm not sure there is a compelling reason to do this, and would love a use case.

I worked on the JRuby implementation and reviewed the Groovy one, and I think that we could be doing a lot more with scripting languages, so you have my attention.
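The idea of letting anyone register an arbitrary SE can be sketched as a small registry keyed by keyword. This is only an illustration of the concept under stated assumptions: `PluggableEngine`, `REngineStub` and `EngineRegistry` are hypothetical names invented here, and nothing below is Pig's actual ScriptEngine API.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical engine abstraction for illustration; this is NOT Pig's ScriptEngine API.
interface PluggableEngine {
    String name();
}

// Stub engine used only to demonstrate registration.
final class REngineStub implements PluggableEngine {
    @Override
    public String name() {
        return "r-stub";
    }
}

// Hypothetical registry mapping a keyword (e.g. "jython") to an engine factory.
final class EngineRegistry {
    private final Map<String, Supplier<? extends PluggableEngine>> factories =
            new ConcurrentHashMap<>();

    // Registers a factory under a case-insensitive keyword.
    // Duplicates are rejected so an existing engine is never silently replaced.
    void register(String keyword, Supplier<? extends PluggableEngine> factory) {
        String key = keyword.toLowerCase(Locale.ROOT);
        if (factories.putIfAbsent(key, factory) != null) {
            throw new IllegalArgumentException("Engine already registered: " + key);
        }
    }

    // Creates a new engine instance for the keyword, or returns empty if unknown.
    Optional<PluggableEngine> lookup(String keyword) {
        Supplier<? extends PluggableEngine> factory =
                factories.get(keyword.toLowerCase(Locale.ROOT));
        if (factory == null) {
            return Optional.empty();
        }
        PluggableEngine engine = factory.get();
        return Optional.of(engine);
    }
}

final class EngineRegistryDemo {
    public static void main(String[] args) {
        EngineRegistry registry = new EngineRegistry();
        registry.register("r", REngineStub::new);
        System.out.println(registry.lookup("R").map(PluggableEngine::name).orElse("none"));
    }
}
```

In such a design the built-in engines would be registered through the same call as third-party ones, which is roughly what "implement the current SEs in terms of that interface" would amount to.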
-
Re: Custom Scripting Engine
Connor Woodson 2013-01-23, 00:15
I'm starting work on an R scripting engine; I'm not entirely sure how it will be used, but I know that there have been attempts to get R working with MapReduce / EMR and I thought it would be cool to do that through Pig. (One fun use case might be to generate plots/graphs during the MR job, then do something with them.)

The easy answer for how to get this working with Pig is to just put new scripting engines alongside the existing ones and update the enum in ScriptEngine to include them; however, I would like to use this in EMR, which doesn't update its software regularly, so I was hoping there was some hook to get this scripting engine called. It looks like it'll just have to be used for UDFs for now.

If a change is going to be made, I think what would be helpful is a change in how ScriptEngine decides which subclass to call; right now (from what I can tell) it only looks at the file suffix or the #! first line of the script and tries to match those against its internal list. Maybe allow an annotation like
#@ <FQCN of a ScriptEngine>
as the first line of a script to force Pig to use a specific engine.

- Connor
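The selection order proposed above (an explicit `#@` annotation first, then the `#!` line, then the file suffix) can be sketched as follows. This is a minimal illustration, not Pig's code: `EngineResolver`, its lookup tables, and the class name `com.example.RScriptEngine` are hypothetical, and the `#@` annotation is only the proposal from this thread, not an existing Pig feature.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical resolver illustrating the proposed lookup order; not Pig's implementation.
final class EngineResolver {
    private static final String ANNOTATION_PREFIX = "#@";
    private static final String SHEBANG_PREFIX = "#!";

    // Illustrative tables only; the real internal list lives inside Pig.
    private final Map<String, String> byShebang = new LinkedHashMap<>();
    private final Map<String, String> bySuffix = new LinkedHashMap<>();

    EngineResolver() {
        byShebang.put("python", "jython");
        byShebang.put("ruby", "jruby");
        bySuffix.put(".py", "jython");
        bySuffix.put(".rb", "jruby");
        bySuffix.put(".js", "javascript");
        bySuffix.put(".groovy", "groovy");
    }

    // Returns the engine keyword or class name chosen for a script, if any.
    Optional<String> resolve(String fileName, String firstLine) {
        String line = firstLine == null ? "" : firstLine.trim();

        // 1. Proposed annotation: "#@ <FQCN>" forces a specific engine class.
        if (line.startsWith(ANNOTATION_PREFIX)) {
            String fqcn = line.substring(ANNOTATION_PREFIX.length()).trim();
            if (!fqcn.isEmpty()) {
                return Optional.of(fqcn);
            }
        }

        // 2. Shebang line: match the interpreter name against the known list.
        if (line.startsWith(SHEBANG_PREFIX)) {
            for (Map.Entry<String, String> entry : byShebang.entrySet()) {
                if (line.contains(entry.getKey())) {
                    return Optional.of(entry.getValue());
                }
            }
        }

        // 3. File suffix as the last resort.
        for (Map.Entry<String, String> entry : bySuffix.entrySet()) {
            if (fileName.endsWith(entry.getKey())) {
                return Optional.of(entry.getValue());
            }
        }
        return Optional.empty();
    }
}

final class EngineResolverDemo {
    public static void main(String[] args) {
        EngineResolver resolver = new EngineResolver();
        System.out.println(resolver.resolve("plots.R", "#@ com.example.RScriptEngine"));
        System.out.println(resolver.resolve("job.py", "import sys"));
    }
}
```

Checking the annotation before the built-in rules means a script can opt in to an engine that the installed Pig version knows nothing about, which is the EMR scenario described above.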
-
Re: Custom Scripting Engine
Jonathan Coveney 2013-01-23, 00:44
Ahhh, I see. That makes sense. Sadly, this isn't possible in the current version of Pig, but this is a really good reason to want it. Can you make a ticket about making it possible to plug in scripting engines without having to make a code change to Pig? I think this would be useful for this reason.

That said, if you dig down into how these implementations work, they are based on EvalFuncs, so manually making UDFs to do it is an annoyance, but functionally quite similar.

Question about R: is there a JVM implementation, or are you shelling out?
-
Re: Custom Scripting Engine
Connor Woodson 2013-01-23, 01:10
There are two ways to go about using R with Java (that I've found). Both are a bit of a hassle depending on your setup.

JRI is a JNI interface to R, so you don't need R installed on the machine for it to work. But you do need to include a set of DLLs in the classpath; the best way I've found to do this is to bundle the DLLs in the .jar and then copy them to the local directory at runtime (copying them elsewhere and changing java.library.path won't work). There are some features missing from JRI, though, especially the ability to have multiple environments/sessions; I don't yet have a plan for the R/Pig integration, but having sessions might be useful.

The other method is through Rserve, which is both a Java package and an application; the application sets up an R server that by default allows only a single connection from a local machine (if you wanted, each map-reduce job could connect to the same R server/instance, but I don't think that's useful). To start this up, you would need R installed and then run Rserve. In EMR this would be possible, as it does have R, so you would just need a bootstrap script to start R. Optionally, it is probably possible to tell Rserve to start from within Java, but that's much trickier.

I prefer the first method as it eliminates the requirement of having R installed; however, I'm hoping to implement both (for Rserve, I'll require that the server is already started, and maybe include an option for connecting to a specific server).

I don't have a clear vision of how R and Pig will interact; it will have to be something different from Python or JavaScript, but I don't know how different. I want to just scratch out something basic and then try to evolve it from there.

I'll go ahead and submit that Jira.

Thanks,

- Connor
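The "bundle the DLLs in the .jar and copy them to the local directory at runtime" approach might look roughly like the sketch below. It uses only standard JDK calls; `NativeLibExtractor` and the resource path `/native/jri.dll` are hypothetical names chosen for illustration, and the actual set of libraries JRI needs is not covered here.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: copies a native library bundled in the jar to a directory on disk.
final class NativeLibExtractor {

    // Looks up a classpath resource (e.g. "/native/jri.dll") and copies it into targetDir.
    static Path extract(Class<?> owner, String resourcePath, Path targetDir) throws IOException {
        try (InputStream in = owner.getResourceAsStream(resourcePath)) {
            if (in == null) {
                throw new FileNotFoundException("Resource not found on classpath: " + resourcePath);
            }
            String fileName = resourcePath.substring(resourcePath.lastIndexOf('/') + 1);
            return copy(in, fileName, targetDir);
        }
    }

    // Writes the stream to targetDir/fileName, replacing any earlier copy.
    static Path copy(InputStream in, String fileName, Path targetDir) throws IOException {
        Files.createDirectories(targetDir);
        Path target = targetDir.resolve(fileName);
        Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }
}

final class NativeLibExtractorDemo {
    public static void main(String[] args) {
        // Assumed resource name; it only resolves if the jar really bundles this file.
        try {
            Path lib = NativeLibExtractor.extract(
                    NativeLibExtractorDemo.class, "/native/jri.dll", Paths.get("."));
            System.out.println("Extracted to " + lib.toAbsolutePath());
        } catch (IOException e) {
            System.out.println("Nothing extracted: " + e.getMessage());
        }
    }
}
```

Once the file is on disk, `System.load` with the absolute path of the extracted file is one way to load it explicitly; whether that is sufficient for JRI and its dependent libraries is not something this sketch establishes.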