Re: Error when run python streaming
Thomas Bach 2013-01-23, 14:19
On Wed, Jan 23, 2013 at 01:58:29PM +0800, Dongliang Sun wrote:
> I import a third-party module 'Pandas'.
>
> It's successful when I directly run the python code.
> Also successful when run the pig script in local mode.
>
> But has error when run pig script in MapReduce, to debug I comment all of
> the code except printing out one line.
> Still does not work.
> When I comment the 'import pandas', it works.

Is Pandas installed in a virtual environment? Then the problem is probably that Pig/Hadoop starts your job in a completely fresh environment: the Python interpreter is invoked from e.g. /usr/bin, doesn't know anything about the packages installed in the site-packages path of your virtual environment, and so the import fails.

How do you invoke your script? Does it start with a shebang? How is the script installed?

Can you also provide the full traceback? You should find it in the error logs of the job.

Can you boil down the code to a minimal example that fails, both the Python and the Pig code?

Regards,
Thomas Bach.
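
One way to check the "fresh environment" theory is to have the job itself report which interpreter it runs under and whether it can locate pandas. The following is only a minimal sketch, assuming Python 3 on the task nodes; the helper names `describe_environment` and `report` are made up for illustration and are not part of Pig or Hadoop. It writes to stderr so the output does not mix with the records the script emits on stdout.

```python
import importlib.util
import sys


def describe_environment(module_name="pandas"):
    """Collect facts about the interpreter that is actually running this script."""
    try:
        # find_spec locates a top-level module without importing it;
        # it returns None when the module is not on sys.path.
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        spec = None
    return {
        "executable": sys.executable,        # e.g. /usr/bin/python vs. a virtualenv path
        "version": sys.version.split()[0],
        "path": list(sys.path),              # directories searched for imports
        "module": module_name,
        "module_found": spec is not None,
        "module_origin": getattr(spec, "origin", None),
    }


def report(info, stream=sys.stderr):
    """Write the collected facts one per line; stderr keeps stdout clean."""
    for key in ("executable", "version", "module", "module_found",
                "module_origin", "path"):
        stream.write("%s: %s\n" % (key, info[key]))


if __name__ == "__main__":
    report(describe_environment())
```

Running this once directly on the command line and once through the Pig script in MapReduce mode, then comparing the `executable` and `path` lines, should show whether the two runs use the same interpreter and the same site-packages directory.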