MapReduce >> mail # user >> Best practices configuring libraries on the backend.


Dmitriy Lyubimov 2012-03-28, 01:43
Re: Best practices configuring libraries on the backend.
Dmitriy,

To deal with servers that keep shared libraries in different
locations, make sure the _TaskTracker_'s -Djava.library.path is set
correctly on each server.  That library path should then be passed
along to each child (M/R) task, in *addition* to the
{mapred.child.java.opts} that you specify in the client-side
configuration.
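
A minimal sketch of the behavior described above, written as a simplified Python model rather than Hadoop's actual code. The function name `build_child_jvm_args` and the example paths are hypothetical; the only rule it encodes is the one stated in this message: the TaskTracker's own library path is added to the child options unless the client already supplied one.

```python
import shlex

LIB_FLAG = "-Djava.library.path="


def build_child_jvm_args(child_java_opts, tasktracker_library_path):
    """Model how a child task's JVM arguments could be assembled.

    child_java_opts: the client-side mapred.child.java.opts string.
    tasktracker_library_path: the java.library.path of the TaskTracker
    JVM on this particular server (a hypothetical input for this sketch).
    """
    args = shlex.split(child_java_opts)
    client_set_path = any(arg.startswith(LIB_FLAG) for arg in args)
    if not client_set_path:
        # No client-side path: the site-specific one is passed along.
        args.append(LIB_FLAG + tasktracker_library_path)
    return args


# The client sets only the heap; each server contributes its own path.
print(build_child_jvm_args("-Xmx512m", "/opt/site-a/native"))
print(build_child_jvm_args("-Xmx512m", "/usr/local/site-b/native"))
```

In this model the same client-side options yield a different library path on each server, which is the effect wanted when packages are installed in different locations.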

One caveat: on the client side, don't include "-Djava.library.path" in
those options, or that path will be passed along to all of the child
tasks, overriding the site-specific one you set on the TaskTracker.
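
One way to guard against this on the client is to strip any library-path flag from the options string before submitting the job. The helper below is a hypothetical sketch in Python (`strip_library_path` is not a Hadoop API); it only manipulates the string that would be set as mapred.child.java.opts.

```python
import shlex

LIB_FLAG = "-Djava.library.path="


def strip_library_path(child_java_opts):
    """Return (cleaned_opts, removed_flags) for a client-side opts string."""
    kept, removed = [], []
    for arg in shlex.split(child_java_opts):
        (removed if arg.startswith(LIB_FLAG) else kept).append(arg)
    # shlex.join (Python 3.8+) re-quotes arguments that contain spaces.
    return shlex.join(kept), removed


cleaned, removed = strip_library_path("-Xmx1g -Djava.library.path=/home/me/lib")
print(cleaned)   # options that are safe to send from the client
print(removed)   # flags that would have overridden the site-specific path
```

Returning the removed flags as well lets the caller log a warning instead of silently dropping them.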
George
On 2012/03/28 10:43, Dmitriy Lyubimov wrote:
> Hello,
>
> I have a couple of questions regarding mapreduce configurations.
>
> We install various platforms on data nodes that require a mixed set
> of native libraries.
>
> Part of the problem is that, in the general case, these software
> platforms may be installed into different locations in the backend
> (we try to unify it, but still). This means each node may require a
> site-specific -Djava.library.path setting.
>
> I configured individual JVM options (mapred.child.java.opts) on each
> node to include a specific set of paths. However, I encountered two
> problems:
>
> #1: My setting doesn't take effect unless I also declare it final
> on the data node. Otherwise it is overridden by the default -Xmx200m
> value from the driver, EVEN when I don't set it on the driver at all
> (and there seems to be no way to unset it).
>
> However, marking it "final" on the backend creates a problem if one
> of the numerous jobs we run still wants to override the setting. The
> ideal behavior would be: if I don't set it in the driver, the backend
> value kicks in; otherwise the driver's value is used. But I did not
> find a way to do that for this particular setting. Could somebody
> clarify the best workaround? Thank you.
>
> #2: The ideal behavior would actually be to merge the driver-specific
> and backend-specific settings. E.g. the backend may need to configure
> specific software package locations, while the client may sometimes
> wish to set the heap size, etc. Is there a best practice to achieve
> this effect?
>
> Thank you very much in advance.
> -Dmitriy
George Datskos 2012-03-28, 02:17
Dmitriy Lyubimov 2012-03-28, 03:08
Bharath Mundlapudi 2012-03-28, 12:14
Dmitriy Lyubimov 2012-03-28, 20:19
George Datskos 2012-03-29, 00:04
Harsh J 2012-03-29, 04:57
Dmitriy Lyubimov 2012-03-30, 18:00