Pig >> mail # user >> Running e2e RubyUDFs test in MR mode


Re: Running e2e RubyUDFs test in MR mode
Hi Subir,

Thanks for asking. I have since found the cause of the issue and filed a JIRA:
https://issues.apache.org/jira/browse/PIG-2745. Please see the JIRA for
details.

Cheolsoo

On Sat, Jun 9, 2012 at 5:42 AM, Subir S <[EMAIL PROTECTED]> wrote:

> Can you please share a snippet showing how you are using these UDFs?
>
> On Fri, Jun 8, 2012 at 6:01 AM, Cheolsoo Park <[EMAIL PROTECTED]>
> wrote:
>
> > Hello,
> >
> > I checked out branch-0.10, and I am trying to run e2e RubyUDFs tests in MR
> > mode. But I am getting the following error:
> >
> > > java.lang.IllegalStateException: Could not initialize interpreter (from
> > > file system or classpath) with
> > > /home/cheolsoo/pig-0.10/test/e2e/pig/testdist/libexec/ruby/scriptingudfs.rb
> > >         at org.apache.pig.scripting.ScriptEngine.getScriptAsStream(ScriptEngine.java:145)
> > >         at org.apache.pig.scripting.jruby.JrubyScriptEngine$RubyFunctions.getFromCache(JrubyScriptEngine.java:104)
> > >         at org.apache.pig.scripting.jruby.JrubyScriptEngine$RubyFunctions.getFunctions(JrubyScriptEngine.java:120)
> > >         at org.apache.pig.scripting.jruby.JrubyEvalFunc.initialize(JrubyEvalFunc.java:87)
> > >         at org.apache.pig.scripting.jruby.JrubyEvalFunc.exec(JrubyEvalFunc.java:103)
> > >         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:216)
> > >         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:263)
> > >         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:328)
> >
> >
> > Looking at the source code (ScriptEngine.java), I found
> > that scriptingudfs.rb should be found via classpath:
> >
> > >         if (file.exists()) {
> > >             try {
> > >                 is = new FileInputStream(file);
> > >             } catch (FileNotFoundException e) {
> > >                 throw new IllegalStateException("could not find existing file "+scriptPath, e);
> > >             }
> > >         } else {
> > >             if (file.isAbsolute()) {
> > >                 is = ScriptEngine.class.getResourceAsStream(scriptPath);
> > >             } else {
> > >                 is = ScriptEngine.class.getResourceAsStream("/" + scriptPath);
> > >             }
> > >         }
> >
> >
> > Now I looked at the Job jar generated by Pig and found that
> > scriptingudfs.rb indeed exists in that jar:
> >
> > > cheolsoo@localhost:~/workspace/pig-cheolsoo $ jar tvf Job9203441412304345930.jar | grep scriptingudfs.rb
> > >   2491 Thu Jun 07 14:42:44 PDT 2012 /home/cheolsoo/pig-0.10/test/e2e/pig/testdist/scriptingudfs.rb
> >
> >
> > Since scriptingudfs.rb is inside the Job jar, I imagine that
> > getResourceAsStream() should be able to find it, but apparently it doesn't.
> >
> > I am wondering if anyone has been able to run these tests in MR mode and
> > could give me some pointers. Any help would be appreciated!
> >
> > Thanks,
> > Cheolsoo
> >
> > p.s. The test works fine in local mode, which is not surprising
> > since scriptingudfs.rb would be found via the file system. I also see a
> > similar issue with the e2e Jython tests, where Jython scripts are not found,
> > with the following error:
> >
> > > 2012-06-05 22:44:19,491 [main] INFO
> > > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> > > - Failed!
> > > 2012-06-05 22:44:19,513 [main] ERROR
> > > org.apache.pig.tools.grunt.GruntParser - ERROR 2997: Unable to recreate
> > > exception from backed error: java.io.IOException: Deserialization error:
> > > could not instantiate 'org.apache.pig.scripting.jython.JythonFunction' with
> > > arguments
> > > '[/home/cheolsoo/pig-0.10/test/e2e/pig/testdist/libexec/python/scriptingudf.py,
> > > square]'
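
For reference, the lookup order in the quoted ScriptEngine.java snippet (file system first, then the classpath) can be reproduced outside Pig with a small stand-alone Java sketch. The class name ScriptLookupSketch, the Source enum, and both helper methods are hypothetical and not part of Pig. The sketch only reports where a given script path resolves from in the current JVM; it does not reproduce the analysis or the fix in PIG-2745.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical diagnostic helper (not part of Pig): mirrors the lookup
// order of the quoted snippet and reports which source would be used.
public class ScriptLookupSketch {

    /** Where a script path was resolved from. */
    public enum Source { FILE_SYSTEM, CLASSPATH, NOT_FOUND }

    /**
     * Builds the name handed to Class.getResourceAsStream, as in the quoted
     * snippet: absolute paths are passed unchanged, relative paths get a
     * leading "/" so they are resolved from the classpath root.
     */
    public static String toResourceName(String scriptPath) {
        return new File(scriptPath).isAbsolute() ? scriptPath : "/" + scriptPath;
    }

    /** Checks the file system first, then falls back to the classpath. */
    public static Source locate(String scriptPath) throws IOException {
        if (new File(scriptPath).exists()) {
            return Source.FILE_SYSTEM;
        }
        // getResourceAsStream returns null when the resource is not visible
        // to this class's class loader; try-with-resources tolerates null.
        try (InputStream is =
                 ScriptLookupSketch.class.getResourceAsStream(toResourceName(scriptPath))) {
            return is != null ? Source.CLASSPATH : Source.NOT_FOUND;
        }
    }

    public static void main(String[] args) throws IOException {
        for (String path : args) {
            System.out.println(path + " -> " + locate(path));
        }
    }
}
```

Running this with the Job jar on the classpath and the script path from the stack trace as the argument would show whether that exact name is visible as a classpath resource in that JVM, which is the question the original message raises.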