Re: UDF with dependency on external jars & native code
Thejas M Nair 2010-07-29, 17:46
You can use the MR distributed cache to push the native libs - see -
"The DistributedCache can also be used to distribute both jars and native
libraries for use in the map and/or reduce tasks. The child-jvm always has
its current working directory added to the java.library.path and
LD_LIBRARY_PATH. And hence the cached libraries can be loaded via
System.loadLibrary or System.load. More details on how to load shared
libraries through distributed cache are documented at
So try using -Dmapred.cache.files=<dfs path to file> in your pig command line.
Please let us know if this worked for you.
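On the loading side, here is a rough sketch of what the UDF could do once the library has been shipped. It assumes the cached file shows up in the task's current working directory under its platform library name (e.g. libfoo.so for "foo" on Linux), which may depend on how the cache entry is set up on your cluster. NativeLibLoader and the library name "foo" are made up for illustration; only System.load, System.loadLibrary and System.mapLibraryName are standard Java calls.

```java
import java.io.File;

/**
 * Hypothetical helper (not part of Pig or Hadoop) for a UDF that needs a
 * native library distributed through the MR distributed cache.
 */
final class NativeLibLoader {
    private NativeLibLoader() {}

    /**
     * Where we expect the cached library to be: the task's current working
     * directory plus the platform-specific file name for libName.
     * Assumption: the cache entry is visible in the working directory.
     */
    static File expectedLocation(String libName) {
        // Maps "foo" to "libfoo.so" on Linux, "foo.dll" on Windows, etc.
        String fileName = System.mapLibraryName(libName);
        return new File(System.getProperty("user.dir"), fileName);
    }

    /**
     * Loads the library by absolute path if it is present in the working
     * directory, otherwise falls back to the java.library.path lookup.
     * Throws UnsatisfiedLinkError if the library cannot be found either way.
     */
    static synchronized void load(String libName) {
        File candidate = expectedLocation(libName);
        if (candidate.isFile()) {
            System.load(candidate.getAbsolutePath());
        } else {
            System.loadLibrary(libName);
        }
    }
}
```

The UDF would call NativeLibLoader.load("foo") once, for example from a static initializer, before it touches any native method. Loading by absolute path first just avoids depending on java.library.path being set the way the docs describe.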
For the jars, you can also use a commandline option -
(thanks to Pradeep for suggesting this solution)
On 7/26/10 9:38 AM, "Kaluskar, Sanjay" <[EMAIL PROTECTED]> wrote:
> I am new to PIG and running into a fairly basic problem. I have a UDF
> which depends on some other 3rd party jars & libraries. I can call the
> UDF from my PIG script either from grunt or by running "java -cp ...
> org.apache.pig.Main <script>" in local mode, when I have the jars on the
> classpath and the libraries on LD_LIBRARY_PATH. But, in mapreduce mode I
> get errors from Hadoop because it doesn't find the classes & libraries.
> I saw another thread on this forum, which had a workaround for the jar.
> I can explicitly call register on the dependency, and that seems to fix
> the problem. But, there doesn't seem to be a way of specifying the
> native libraries to PIG such that the map/reduce jobs are set up to
> access them.
> I am using PIG 0.5.0. Any help is appreciated!