Re: Best practices configuring libraries on the backend.

I just double-checked, and the caveat I stated earlier is incorrect.
A "-Djava.library.path" set in the client's {mapred.child.java.opts}
should just be combined with the "-Djava.library.path" that each
TaskTracker has when the library path for each child (M/R) task is
built, rather than overriding it.  So that's even better, I guess.
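
To illustrate the behavior described above, here is a minimal sketch in Python. It is a model only, not Hadoop's actual code. The function name merge_child_opts, the example paths, and the ordering of the two paths (client first, TaskTracker second) are all assumptions made for illustration.

```python
import os

LIB_FLAG = "-Djava.library.path="


def merge_child_opts(child_opts, tasktracker_lib_path, sep=os.pathsep):
    """Model how a child task's JVM arguments might be assembled.

    child_opts           -- the client's mapred.child.java.opts string
    tasktracker_lib_path -- the library path the TaskTracker itself runs with
    sep                  -- path separator (":" on Unix-like systems)

    Hypothetical helper: the ordering of the merged paths is an assumption.
    """
    args = child_opts.split()
    for i, arg in enumerate(args):
        if arg.startswith(LIB_FLAG):
            # Client supplied a library path: combine it with the
            # TaskTracker's path instead of replacing it.
            args[i] = arg + sep + tasktracker_lib_path
            return args
    # Client supplied no library path: the TaskTracker's path is used alone.
    args.append(LIB_FLAG + tasktracker_lib_path)
    return args
```

In this model, a client that sets only heap options still gets the site-specific library path, and a client that sets its own path ends up with both.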
On 2012/03/28 11:06, George Datskos wrote:
> Dmitriy,
> To deal with different servers having various shared libraries in
> different locations, you can simply make sure the _TaskTracker_'s
> -Djava.library.path is set correctly on each server.  That library
> path should be passed along to each child (M/R) task.  (in *addition*
> to the {mapred.child.java.opts} that you specify on the client-side
> configuration options)
> One caveat: on the client side, don't include "-Djava.library.path", or
> that path will be passed along to all of the child tasks, overriding
> the site-specific one you set on the TaskTracker.
> George
> On 2012/03/28 10:43, Dmitriy Lyubimov wrote:
>> Hello,
>> I have a couple of questions regarding mapreduce configurations.
>> We install various platforms on data nodes that require a mixed set of
>> native libraries.
>> Part of the problem is that, in the general case, these software
>> platforms may be installed into different locations in the backend (we
>> try to unify it, but still). This means each site may require its own
>> -Djava.library.path setting.
>> I configured individual JVM options (mapred.child.java.opts) on each
>> node to include a specific set of paths. However, I encountered 2
>> problems:
>> #1: My setting doesn't go into effect unless I also declare it final
>> on the data node. It's just being overridden by the default -Xmx200m
>> value from the driver EVEN when I don't set it on the driver at all
>> (and there seems to be no way to unset it).
>> However, using the "final" spec at the backend creates a problem if
>> one of the numerous jobs we run still needs to override the setting.
>> The ideal behavior is: if I don't set it in the driver, the backend
>> value kicks in; otherwise the driver's value is used. But I did not
>> find a way to do that for this particular setting. Could somebody
>> clarify the best workaround? Thank you.
>> #2: The ideal behavior would actually be to merge driver-specific and
>> backend-specific settings. E.g. the backend may need to configure
>> specific software package locations, while the client may sometimes
>> wish to set heap size etc. Is there a best practice to achieve this
>> effect?
>> Thank you very much in advance.
>> -Dmitriy