I read the tutorial once more and decided to try the simplest possible
config that might work: copying the necessary native libs using the
distributed cache (they are in an archive called
installer-9.0.2-SNAPSHOT.zip) and hoping that PIG will take care of
copying the jars that are registered. I also need to set some env
variables to get my native code to work.
So now I have the following in mapred-site.xml:
<property>
<name>mapred.cache.archives</name>
<value>hdfs://inarch03.informatica.com:54310/infadoop/installer-9.0.2-SNAPSHOT.zip#infadoop</value>
</property>
<property>
<name>mapred.create.symlink</name>
<value>yes</value>
</property>
<property>
<name>mapred.child.java.opts</name>
<value> -Xmx512M -Djava.library.path=infadoop/infa-resources </value>
</property>
<property>
<name>mapred.child.env</name>
<value>LD_LIBRARY_PATH=infadoop/infa-resources,INFA_RESOURCES=infadoop/infa-resources,IMF_CPP_RESOURCE_PATH=infadoop/infa-resources</value>
</property>
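As a sanity check on the last property, here is a minimal Java sketch of
how I understand a comma-separated NAME=VALUE string like the one above
to break down into individual variables. The class and method names
(ChildEnvSketch, parse) are made up for illustration, and this is not
Hadoop's actual parsing code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ChildEnvSketch {
    // Hypothetical helper: splits "A=x,B=y" into ordered name/value pairs.
    public static Map<String, String> parse(String value) {
        Map<String, String> env = new LinkedHashMap<>();
        for (String pair : value.split(",")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // skip entries that have no NAME= prefix
            }
            env.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
        }
        return env;
    }

    public static void main(String[] args) {
        String value = "LD_LIBRARY_PATH=infadoop/infa-resources,"
                + "INFA_RESOURCES=infadoop/infa-resources,"
                + "IMF_CPP_RESOURCE_PATH=infadoop/infa-resources";
        // Print each variable on its own line for easy inspection.
        parse(value).forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

Note that all three paths are relative, so they only make sense if they
are resolved against the task's working directory, where the infadoop
symlink is expected to be created.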
With this change the job setup fails, and I see the following error in
the userlogs:
Exception in thread "main" java.lang.NoClassDefFoundError:
Caused by: java.lang.ClassNotFoundException:
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
Could not find the main class: . Program will exit.
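One thing that may be worth ruling out is the empty class name in the
error. The sketch below assumes (this is a guess, not something
confirmed in this thread) that the child JVM options string is tokenized
by splitting on a single space. Under that assumption, a value with a
leading space, like the mapred.child.java.opts value above, yields an
empty first token, and an empty argument on a java command line could be
read as the main class name. The names OptsSplitSketch and tokenize are
hypothetical.

```java
import java.util.Arrays;

public class OptsSplitSketch {
    // Hypothetical tokenizer: naive split on a single space, no trimming.
    public static String[] tokenize(String opts) {
        return opts.split(" ");
    }

    public static void main(String[] args) {
        String asConfigured = " -Xmx512M -Djava.library.path=infadoop/infa-resources ";
        // The leading space produces an empty first token;
        // trailing empty tokens are dropped by String.split.
        System.out.println(Arrays.toString(tokenize(asConfigured)));
        // Trimming the value first removes the empty token.
        System.out.println(Arrays.toString(tokenize(asConfigured.trim())));
    }
}
```

If that turns out to be the cause, removing the spaces around the value
in mapred-site.xml would be the first thing to try.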
-----Original Message-----
From: Thejas M Nair [mailto:[EMAIL PROTECTED]]
Sent: Thursday, August 05, 2010 12:16 AM
To: [EMAIL PROTECTED]; Kaluskar, Sanjay
Subject: Re: UDF with dependency on external jars & native code
On 8/4/10 3:13 AM, "Kaluskar, Sanjay" <[EMAIL PROTECTED]> wrote:
> The register isn't working after I made some changes to
> mapred-site.xml.
> Right now I am executing PIG script from the command-line as follows:
>
Do you know which change in mapred-site.xml caused it to stop working?
Was it after adding mapred.cache.archives?
> PigudfException is an exception defined in one of the jars on the
> classpath of infapig.jar.
Is PigudfException also packaged within the jar?
-Thejas
> -----Original Message-----
> From: Thejas M Nair [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, July 29, 2010 11:16 PM
> To: [EMAIL PROTECTED]; Kaluskar, Sanjay
> Subject: Re: UDF with dependency on external jars & native code
>
> You can use the MR distributed cache to push the native libs - see
>
> http://hadoop.apache.org/common/docs/r0.20.1/mapred_tutorial.html#DistributedCache
>
> "The DistributedCache can also be used to distribute both jars and
> native libraries for use in the map and/or reduce tasks. The
> child-jvm always has its current working directory added to the
> java.library.path and LD_LIBRARY_PATH. And hence the cached libraries
> can be loaded via System.loadLibrary or System.load. More details on
> how to load shared libraries through distributed cache are documented at
> native_libraries.htm"
>
> So using -Dmapred.cache.files=<dfs path to file> in your pig
> command line should work.
>
> Please let us know if this worked for you.
>
> For the jars, you can also use a commandline option -
> -Dpig.additional.jars="jar1:jar2.."
>
> (thanks to Pradeep for suggesting this solution)
>
> Thanks,
> Thejas
>
> On 7/26/10 9:38 AM, "Kaluskar, Sanjay" <[EMAIL PROTECTED]>
> wrote:
>
>> I am new to PIG and running into a fairly basic problem. I have a UDF
>> which depends on some other 3rd party jars & libraries. I can call
>> the UDF from my PIG script either from grunt or by running "java -cp
>> ... org.apache.pig.Main <script>" in local mode, when I have the jars on