Pig >> mail # user >> UDF with dependency on external jars & native code
Re: UDF with dependency on external jars & native code

On 8/4/10 3:13 AM, "Kaluskar, Sanjay" <[EMAIL PROTECTED]> wrote:

> The register isn't working after I made some changes to mapred-site.xml.
> Right now I am executing PIG script from the command-line as follows:

Do you know what change in mapred-site.xml caused it to stop working? Was it
after adding mapred.cache.archives?

> PigudfException is an exception defined in one of the jars on the
> classpath of infapig.jar.

Is PigudfException also packaged within the jar?
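For reference, a minimal sketch of the register workaround discussed in this thread. The dependency jar path and the UDF class name are hypothetical; infapig.jar is the UDF jar mentioned above:

```
-- Hypothetical jar paths and UDF class name. REGISTER ships each jar to
-- the map/reduce back-end, so the UDF's dependencies resolve there too.
REGISTER /path/to/thirdparty-dep.jar;
REGISTER /path/to/infapig.jar;
A = LOAD 'input' AS (line:chararray);
B = FOREACH A GENERATE com.example.MyUdf(line);
```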

> -----Original Message-----
> From: Thejas M Nair [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, July 29, 2010 11:16 PM
> To: [EMAIL PROTECTED]; Kaluskar, Sanjay
> Subject: Re: UDF with dependency on external jars & native code
> You can use the MR distributed cache to push the native libs - see -
> http://hadoop.apache.org/common/docs/r0.20.1/mapred_tutorial.html#DistributedCache
> "The DistributedCache can also be used to distribute both jars and
> native libraries for use in the map and/or reduce tasks. The child-jvm
> always has its current working directory added to the java.library.path
> and LD_LIBRARY_PATH. And hence the cached libraries can be loaded via
> System.loadLibrary or System.load. More details on how to load shared
> libraries through distributed cache are documented at
> native_libraries.htm"
> So using -Dmapred.cache.files=<dfs path to file> in your pig
> command line should work.
> Please let us know if this worked for you.
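A sketch of such an invocation (the HDFS path, library name, and symlink fragment are hypothetical; on older Hadoop releases, mapred.create.symlink=yes is also needed for the "#name" symlink to be created in the task's working directory):

```shell
# Hypothetical paths. The "#libudfdep.so" fragment names the symlink
# created in each task's working directory, which the child JVM already
# has on java.library.path and LD_LIBRARY_PATH.
java -cp pig.jar org.apache.pig.Main \
    -Dmapred.cache.files=hdfs:///user/sanjay/libudfdep.so#libudfdep.so \
    -Dmapred.create.symlink=yes \
    script.pig
```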
> For the jars, you can also use a commandline option -
> -Dpig.additional.jars="jar1:jar2.."
> (thanks to Pradeep for suggesting this solution)
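A sketch with hypothetical jar names (the separator is ":", as in a Java classpath; the jars listed here are the UDF's third-party dependencies):

```shell
# Hypothetical jar names; Pig ships each listed jar to the map/reduce
# jobs in addition to the script's registered jars.
java -cp pig.jar org.apache.pig.Main \
    -Dpig.additional.jars=dep-a.jar:dep-b.jar \
    script.pig
```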
> Thanks,
> Thejas
> On 7/26/10 9:38 AM, "Kaluskar, Sanjay" <[EMAIL PROTECTED]>
> wrote:
>> I am new to PIG and running into a fairly basic problem. I have a UDF
>> which depends on some other 3rd party jars & libraries. I can call the
>> UDF from my PIG script either from grunt or by running "java -cp ...
>> org.apache.pig.Main <script>" in local mode, when I have the jars on
>> the classpath and the libraries on LD_LIBRARY_PATH. But, in mapreduce
>> mode I get errors from Hadoop because it doesn't find the classes &
>> libraries.
>> I saw another thread on this forum, which had a workaround for the
>> jar.
>> I can explicitly call register on the dependency, and that seems to
>> fix the problem. But, there doesn't seem to be a way of specifying the
>> native libraries to PIG such that the map/reduce jobs are set up to
>> access them.
>> I am using PIG 0.5.0. Any help is appreciated!
>> Thanks,
>> -sanjay