Running e2e RubyUDFs test in MR mode
Cheolsoo Park 2012-06-08, 00:31
Hello,
I checked out branch-0.10, and I am trying to run the e2e RubyUDFs tests in MR mode. But I am getting the following error:

    java.lang.IllegalStateException: *Could not initialize interpreter (from file system or classpath) with /home/cheolsoo/pig-0.10/test/e2e/pig/testdist/libexec/ruby/scriptingudfs.rb*
        at org.apache.pig.scripting.ScriptEngine.getScriptAsStream(ScriptEngine.java:145)
        at org.apache.pig.scripting.jruby.JrubyScriptEngine$RubyFunctions.getFromCache(JrubyScriptEngine.java:104)
        at org.apache.pig.scripting.jruby.JrubyScriptEngine$RubyFunctions.getFunctions(JrubyScriptEngine.java:120)
        at org.apache.pig.scripting.jruby.JrubyEvalFunc.initialize(JrubyEvalFunc.java:87)
        at org.apache.pig.scripting.jruby.JrubyEvalFunc.exec(JrubyEvalFunc.java:103)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:216)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:263)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:328)

Looking at the source code (ScriptEngine.java), I found that scriptingudfs.rb should be found via the classpath:

    if (file.exists()) {
        try {
            is = new FileInputStream(file);
        } catch (FileNotFoundException e) {
            throw new IllegalStateException("could not find existing file "+scriptPath, e);
        }
    } else {
        if (file.isAbsolute()) {
            *is = ScriptEngine.class.getResourceAsStream(scriptPath);*
        } else {
            is = ScriptEngine.class.getResourceAsStream("/" + scriptPath);
        }
    }

Now I looked at the Job jar generated by Pig and found that scriptingudfs.rb indeed exists in that jar:

    cheolsoo@localhost:~/workspace/pig-cheolsoo $ jar tvf Job9203441412304345930.jar | grep scriptingudfs.rb
    2491 Thu Jun 07 14:42:44 PDT 2012 */home/cheolsoo/pig-0.10/test/e2e/pig/testdist/scriptingudfs.rb*

Since scriptingudfs.rb is inside the Job jar, I imagine that getResourceAsStream() should be able to find it, but apparently it doesn't.

I am wondering if anyone was able to run these tests in MR mode and could provide some pointers to me. Any help would be appreciated!

Thanks,
Cheolsoo

p.s. The test works fine in local mode, which is not surprising since scriptingudfs.rb would be found via the file system.

I also see a similar issue with the e2e Jython tests, where Jython scripts are not found, with the following error:

    2012-06-05 22:44:19,491 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
    2012-06-05 22:44:19,513 [main] ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2997: Unable to recreate exception from backed error: java.io.IOException: Deserialization error: could not instantiate 'org.apache.pig.scripting.jython.JythonFunction' with arguments '[/home/cheolsoo/pig-0.10/test/e2e/pig/testdist/libexec/python/scriptingudf.py, square]'
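
One way to narrow this down is to compare the name the class loader is actually asked for with the names stored in the Job jar. The following is only a diagnostic sketch, not part of Pig: the class `ResourceLookupCheck` and its methods are hypothetical names. It relies on the documented behavior of `Class.getResourceAsStream`, which drops a leading `/` before handing the name to the class loader, so an absolute script path is looked up without its leading slash. Note also that the jar listing above shows the entry with a leading `/` and under `.../testdist/scriptingudfs.rb`, while the error message names `.../testdist/libexec/ruby/scriptingudfs.rb`; whether either difference is the actual cause is an assumption to verify.

```java
import java.io.IOException;
import java.util.jar.JarFile;

// Hypothetical diagnostic helper; not part of Pig.
public class ResourceLookupCheck {

    /**
     * Name the class loader receives when ScriptEngine calls
     * getResourceAsStream(scriptPath) for an absolute path, or
     * getResourceAsStream("/" + scriptPath) for a relative one:
     * in both cases the leading '/' is dropped.
     */
    static String classLoaderName(String scriptPath) {
        return scriptPath.startsWith("/") ? scriptPath.substring(1) : scriptPath;
    }

    /** True if the jar contains an entry stored under exactly this name. */
    static boolean jarHasEntry(String jarPath, String entryName) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: ResourceLookupCheck <job.jar> <scriptPath>");
            return;
        }
        String jarPath = args[0];
        String scriptPath = args[1];
        String lookedUp = classLoaderName(scriptPath);

        // Entry stored under the path exactly as the script was registered.
        System.out.println("entry '" + scriptPath + "': " + jarHasEntry(jarPath, scriptPath));
        // Entry stored under the name the class loader would search for.
        System.out.println("entry '" + lookedUp + "': " + jarHasEntry(jarPath, lookedUp));
    }
}
```

Running it with the Job jar and the script path from the error message as the two arguments would show whether the jar holds an entry under the name that the lookup uses.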