Pig, mail # user - Running e2e RubyUDFs test in MR mode


Re: Running e2e RubyUDFs test in MR mode
Cheolsoo Park 2012-06-09, 19:21
Hi Subir,

Thanks for asking. In fact, I found the cause of the issue and filed a JIRA:
https://issues.apache.org/jira/browse/PIG-2745. Please see the JIRA for
details.

Cheolsoo

On Sat, Jun 9, 2012 at 5:42 AM, Subir S <[EMAIL PROTECTED]> wrote:

> Can you please share a snippet showing how you are using these UDFs?
>
> On Fri, Jun 8, 2012 at 6:01 AM, Cheolsoo Park <[EMAIL PROTECTED]>
> wrote:
>
> > Hello,
> >
> > I checked out branch-0.10, and I am trying to run e2e RubyUDFs tests in MR
> > mode. But I am getting the following error:
> >
> > > java.lang.IllegalStateException: *Could not initialize interpreter (from
> > > file system or classpath) with
> > > /home/cheolsoo/pig-0.10/test/e2e/pig/testdist/libexec/ruby/scriptingudfs.rb*
> > >         at org.apache.pig.scripting.ScriptEngine.getScriptAsStream(ScriptEngine.java:145)
> > >         at org.apache.pig.scripting.jruby.JrubyScriptEngine$RubyFunctions.getFromCache(JrubyScriptEngine.java:104)
> > >         at org.apache.pig.scripting.jruby.JrubyScriptEngine$RubyFunctions.getFunctions(JrubyScriptEngine.java:120)
> > >         at org.apache.pig.scripting.jruby.JrubyEvalFunc.initialize(JrubyEvalFunc.java:87)
> > >         at org.apache.pig.scripting.jruby.JrubyEvalFunc.exec(JrubyEvalFunc.java:103)
> > >         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:216)
> > >         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:263)
> > >         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:328)
> >
> >
> > Looking at the source code (ScriptEngine.java), I found
> > that scriptingudfs.rb should be found via classpath:
> >
> > >         if (file.exists()) {
> > >             try {
> > >                 is = new FileInputStream(file);
> > >             } catch (FileNotFoundException e) {
> > >                 throw new IllegalStateException("could not find existing file "+scriptPath, e);
> > >             }
> > >         } else {
> > >             if (file.isAbsolute()) {
> > >                 *is = ScriptEngine.class.getResourceAsStream(scriptPath);*
> > >             } else {
> > >                 is = ScriptEngine.class.getResourceAsStream("/" + scriptPath);
> > >             }
> > >         }
> >
> >
> > Now I looked at the Job jar generated by Pig and found that
> > scriptingudfs.rb indeed exists in that jar:
> >
> > > cheolsoo@localhost:~/workspace/pig-cheolsoo $ jar tvf Job9203441412304345930.jar | grep scriptingudfs.rb
> > >   2491 Thu Jun 07 14:42:44 PDT 2012 */home/cheolsoo/pig-0.10/test/e2e/pig/testdist/scriptingudfs.rb*
> >
> >
> > Since scriptingudfs.rb is inside the Job jar, I imagine that
> > getResourceAsStream() should be able to find it, but apparently it doesn't.
> >
> > I am wondering if anyone was able to run these tests in MR mode and could
> > give me some pointers. Any help would be appreciated!
> >
> > Thanks,
> > Cheolsoo
> >
> > p.s. The test works fine in local mode, which is not surprising
> > since scriptingudfs.rb would be found via the file system. I also see a
> > similar issue with the e2e Jython tests, where Jython scripts are not found,
> > with the following error:
> >
> > > 2012-06-05 22:44:19,491 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> > > - Failed!
> > > 2012-06-05 22:44:19,513 [main] ERROR
> > > org.apache.pig.tools.grunt.GruntParser - ERROR 2997: Unable to recreate
> > > exception from backed error: java.io.IOException: Deserialization error:
> > > could not instantiate 'org.apache.pig.scripting.jython.JythonFunction' with
> > > arguments
> > > '[/home/cheolsoo/pig-0.10/test/e2e/pig/testdist/libexec/python/scriptingudf.py,
> > > square]'
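
The classpath lookup discussed in the quoted message can be reproduced in isolation. The following is a minimal sketch, not Pig code: the class ResourceLookupSketch and its helpers writeJar and canFind are hypothetical names, and the entry names are shortened stand-ins for the real paths. It builds a temporary jar with a single entry and asks a class loader for resources by name. Class.getResourceAsStream(), as used in ScriptEngine.java, treats a name with a leading "/" as an absolute resource name and hands the remainder to the class loader, so the sketch calls the class loader directly.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

public class ResourceLookupSketch {

    /** Hypothetical helper: writes a temporary jar holding one entry with the given name. */
    static File writeJar(String entryName) throws IOException {
        File jar = File.createTempFile("lookup-sketch", ".jar");
        jar.deleteOnExit();
        try (JarOutputStream out = new JarOutputStream(new FileOutputStream(jar))) {
            out.putNextEntry(new JarEntry(entryName));
            out.write("# placeholder script\n".getBytes("UTF-8"));
            out.closeEntry();
        }
        return jar;
    }

    /** Hypothetical helper: true if a loader over this jar can open the named resource. */
    static boolean canFind(File jar, String resourceName) throws IOException {
        // Parent is null so that only the bootstrap loader and this jar are searched.
        try (URLClassLoader loader =
                 new URLClassLoader(new URL[] { jar.toURI().toURL() }, null);
             InputStream is = loader.getResourceAsStream(resourceName)) {
            return is != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in entry name; the real jar entry in the thread is an absolute path.
        File jar = writeJar("testdist/scriptingudfs.rb");

        // Same name as the jar entry: the stream is found.
        System.out.println(canFind(jar, "testdist/scriptingudfs.rb"));

        // Different directory, same file name: no stream is returned.
        System.out.println(canFind(jar, "testdist/libexec/ruby/scriptingudfs.rb"));
    }
}
```

Under these assumptions the first lookup prints true and the second prints false: a jar entry is only reachable under the exact name it was stored with. Whether this is what happens inside the Pig job jar is not established here; the details are in PIG-2745.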