Pig >> mail # user >> Importing python modules in embedded pig

Chun Yang 2012-06-16, 01:30
Re: Importing python modules in embedded pig
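I have seen this subprocess problem before. It happens because we bundle
jython.jar instead of jython-standalone.jar. The plain jython.jar does not
carry the Python standard library modules inside the jar, while
jython-standalone.jar does. See PIG-2665.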
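
A minimal sketch, not from the original thread, for checking which modules the interpreter that runs the embedded script can actually import. The helper name `probe_imports` and the list of module names are illustrative assumptions. It avoids importing anything itself, since even `sys` and `os` were reported as failing.

```python
def probe_imports(names):
    """Return a dict mapping each module name to True if it imports, else False."""
    results = {}
    for name in names:
        try:
            __import__(name)
            results[name] = True
        except ImportError:
            # Raised when the module (or Java package, under Jython) is not on the path.
            results[name] = False
    return results


# Hypothetical list: the three modules from the question plus the Mahout package.
status = probe_imports(
    ["sys", "os", "subprocess", "org.apache.mahout.clustering.canopy"]
)
for name in sorted(status):
    # The and/or idiom keeps this line valid on old Jython releases as well.
    print("%-40s %s" % (name, status[name] and "ok" or "MISSING"))
```

Placed at the top of the script passed to `pig`, this shows whether the standard library and the Mahout classes are visible to the bundled Jython. Running the same file with a standalone interpreter gives a baseline for comparison.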

On Fri, Jun 15, 2012 at 6:30 PM, Chun Yang wrote:

> Hi all,
> I'm trying to run the mahout canopy clustering algorithm through a
> Python-embedded Pig script. The embedded Pig part of the script works
> (using compileFromFile, bind, runSingle), but I can't figure out how to
> run mahout from the same script. Originally I tried running mahout via
> subprocess.call, but when trying to import subprocess, I get:
> ImportError: No module named subprocess
> Similar errors occur when I try to import sys or os modules.
> Next I tried just instantiating the CanopyClustering class, but got a
> similar error when using the following import statement:
> from org.apache.mahout.clustering.canopy import CanopyDriver
> #=> ImportError: No module named mahout
> The ImportErrors don't occur when I run Python interactively. Is this a
> Jython problem? Am I not setting some path properly?
> Other possibly useful info:
> - I'm including the mahout jars in the pig.additional.jars property.
> - I'm running the script using Pig, i.e., `pig myscript.py`
> Thanks,
> Chun