Re: Problem running a Hadoop program with external libraries
Actually, I just misread your email and missed the difference between your
2nd and 3rd attempts.

Are you enforcing min/max JVM heap sizes on your tasks? Are you enforcing a
ulimit (either through your shell configuration, or through Hadoop itself)?
I don't know where these "cannot allocate memory" errors are coming from. If
they're from the OS, could it be because it needs to fork() and momentarily
exceed the ulimit before loading the native libs?
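
For what it's worth, "enforcing through Hadoop itself" would look something
like the following in mapred-site.xml -- the values here are only
placeholders, not recommendations:

     <property>
       <name>mapred.child.java.opts</name>
       <!-- min/max heap for each task JVM -->
       <value>-Xms200m -Xmx512m</value>
     </property>
     <property>
       <name>mapred.child.ulimit</name>
       <!-- virtual-memory limit, in KB, applied to each child process
            (and anything it forks); should comfortably exceed the heap -->
       <value>2097152</value>
     </property>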

- Aaron

On Fri, Mar 4, 2011 at 1:26 PM, Aaron Kimball <[EMAIL PROTECTED]> wrote:

> I don't know if putting native-code .so files inside a jar works. A
> native-code .so is not "classloaded" in the same way .class files are.
>
> So the correct .so files probably need to exist in some physical directory
> on the worker machines. You may want to double-check that the correct
> directory on the worker machines is identified in the JVM property
> 'java.library.path' (instead of, or in addition to, $LD_LIBRARY_PATH). This can
> be manipulated in the Hadoop configuration setting mapred.child.java.opts
> (include '-Djava.library.path=/path/to/native/libs' in the string there.)
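>
> A rough sketch, using the directory from your email (note that setting
> mapred.child.java.opts overrides the default '-Xmx200m', so keep a heap
> flag in there as well):
>
>      <property>
>        <name>mapred.child.java.opts</name>
>        <!-- heap size plus the native-library search path for each task JVM -->
>        <value>-Xmx200m -Djava.library.path=/home/ngc/hadoop-0.21.0/lib/native/Linux-amd64-64</value>
>      </property>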
>
> Also, if you added your .so files to a directory that is already used by
> the tasktracker (like hadoop-0.21.0/lib/native/Linux-amd64-64/), you may
> need to restart the tasktracker instance for it to take effect. (This is
> true of .jar files in the $HADOOP_HOME/lib directory; I don't know if it is
> true for native libs as well.)
>
> - Aaron
>
>
> On Fri, Mar 4, 2011 at 12:53 PM, Ratner, Alan S (IS) <[EMAIL PROTECTED]> wrote:
>
>> We are having difficulty running a Hadoop program that makes calls to
>> external libraries - but this occurs only when we run the program on our
>> cluster, not from within Eclipse, where we are apparently running in
>> Hadoop's standalone mode.  This program invokes the Open Computer Vision
>> libraries (OpenCV and JavaCV).  (I don't think there is a problem with our
>> cluster - we've run many Hadoop jobs on it without difficulty.)
>>
>> 1.      I normally use Eclipse to create jar files for our Hadoop programs
>> but I inadvertently hit the "run as Java application" button and the program
>> ran fine, reading the input file from the Eclipse workspace rather than HDFS
>> and writing the output file to the same place.  Hadoop's output appears
>> below.  (This occurred on the master Hadoop server.)
>>
>> 2.      I then "exported" from Eclipse a "runnable jar" which "extracted
>> required libraries" into the generated jar - presumably producing a jar file
>> that incorporated all the required library functions. (The plain jar file
>> for this program is 17 kB while the runnable jar is 30 MB.)  When I try to
>> run this on my Hadoop cluster (including my master and slave servers) the
>> program reports that it is unable to locate "libopencv_highgui.so.2.2:
>> cannot open shared object file: No such file or directory".  Now, in
>> addition to this library being incorporated inside the runnable jar file, it
>> is also present on each of my servers at
>> hadoop-0.21.0/lib/native/Linux-amd64-64/ where we have loaded the same
>> libraries (to give Hadoop two shots at finding them).  These include:
>>      ...
>>      libopencv_highgui_pch_dephelp.a
>>      libopencv_highgui.so
>>      libopencv_highgui.so.2.2
>>      libopencv_highgui.so.2.2.0
>>      ...
>>
>>      When I poke around inside the runnable jar I find
>> javacv_linux-x86_64.jar which contains:
>>      com/googlecode/javacv/cpp/linux-x86_64/libjniopencv_highgui.so
>>
>> 3.      I then tried adding the following to mapred-site.xml, as suggested
>> in HADOOP-2838, which is supposed to be included in Hadoop 0.21:
>> https://issues.apache.org/jira/browse/HADOOP-2838
>>      <property>
>>        <name>mapred.child.env</name>
>>        <value>LD_LIBRARY_PATH=/home/ngc/hadoop-0.21.0/lib/native/Linux-amd64-64</value>
>>      </property>
>>      The log is included at the bottom of this email, with Hadoop now
>> complaining about a different missing library along with an out-of-memory error.
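>>
>>      (Incidentally, mapred.child.env accepts multiple comma-separated
>>      NAME=VALUE pairs, so other variables could be passed the same way --
>>      the second entry below is purely illustrative:)
>>
>>      <property>
>>        <name>mapred.child.env</name>
>>        <!-- comma-separated NAME=VALUE pairs exported to each task's environment -->
>>        <value>LD_LIBRARY_PATH=/home/ngc/hadoop-0.21.0/lib/native/Linux-amd64-64,OTHER_VAR=value</value>
>>      </property>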