Hadoop, mail # user - Problem running a Hadoop program with external libraries

Re: Problem running a Hadoop program with external libraries
Allen Wittenauer 2011-03-12, 01:03

On Mar 8, 2011, at 1:21 PM, Ratner, Alan S (IS) wrote:
> We had tried putting all the libraries directly in HDFS with a pointer in mapred-site.xml:
> <property><name>mapred.child.env</name><value>LD_LIBRARY_PATH=/user/ngc/lib</value></property>
> as described in https://issues.apache.org/jira/browse/HADOOP-2838 but this did not work for us.

Correct.  This isn't expected to work.

HDFS files are not directly accessible from the shell without some sort of action having taken place.  For the above to work, anything reading the LD_LIBRARY_PATH environment variable would have to know a) that '/user/...' is inside HDFS and b) how to access it.  The distributed cache method works because it pulls files from HDFS and places them in the local UNIX file system.  From there, UNIX processes can access them.
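
To make the point concrete, here is a minimal sketch (not part of Hadoop) of what the local side sees.  The helper name local_library_dirs is hypothetical.  It takes an LD_LIBRARY_PATH-style value and checks each entry against the local file system only, which is roughly the view a UNIX process has of that variable.  It assumes nothing has mounted or copied the HDFS directory locally.

```python
import os


def local_library_dirs(ld_library_path):
    """Map each entry of an LD_LIBRARY_PATH-style string to True/False,
    depending on whether it is a directory on the LOCAL file system.

    Hypothetical helper for illustration only: it never contacts HDFS,
    just as a process reading LD_LIBRARY_PATH never does.
    """
    # Entries are separated by os.pathsep (':' on UNIX); skip empty ones.
    entries = [p for p in ld_library_path.split(os.pathsep) if p]
    return {p: os.path.isdir(p) for p in entries}


if __name__ == "__main__":
    # Fall back to the path from the quoted mapred-site.xml setting.
    value = os.environ.get("LD_LIBRARY_PATH", "/user/ngc/lib")
    for path, present in local_library_dirs(value).items():
        status = "found locally" if present else "not on the local file system"
        print(f"{path}: {status}")
```

Run on a task node, a path that exists only in HDFS would be reported as missing, while a directory populated by the distributed cache would be reported as found.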

HADOOP-2838 is really about providing a way for applications to get to libraries that are already installed at the UNIX level.  (Although, in reality, it would likely be better if applications were linked with a proper runtime library search path, set via -R, -rpath, ld.so.conf, crle, etc., rather than relying on LD_LIBRARY_PATH.)