question on data/task node specific configuration...

    I've run into a situation where it would be helpful to set specific configuration variables local to a data/task node.  I've got a solution, but I'm curious whether there is a best practice for this and whether I'm doing it in a reasonable way.

    Basically, we have a number of machines with 2 cores and 4 GB of memory.  For those boxes we have some options configured higher than the default in mapred-site.xml (specifically, mapred.child.java.opts set to -Xmx1024m).  We recently added a few boxes that have 2 cores and 2 GB of memory, and that mapred.child.java.opts setting causes them to swap, so we'd like those boxes to use -Xmx512m instead.  What I found is that if I simply change the value in mapred-site.xml on the data/task node, it is overridden, but if I also mark the property with <final>true</final>, the local value is used.  So effectively I now have a different mapred-site.xml on each of the nodes.
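
    To make the override concrete, here is a minimal sketch of what the mapred-site.xml on one of the 2 GB boxes might look like. The property name and values are the ones mentioned above; the surrounding layout is the standard Hadoop configuration file format, and the comments are only my understanding of why <final> makes a difference.

```xml
<?xml version="1.0"?>
<!-- Sketch of mapred-site.xml on a 2 cores / 2 GB node (illustrative only) -->
<configuration>
  <property>
    <name>mapred.child.java.opts</name>
    <!-- Smaller child JVM heap so these boxes do not swap -->
    <value>-Xmx512m</value>
    <!-- Marking the property final appears to stop configuration loaded
         afterwards (for example the settings submitted with the job)
         from replacing this node-local value -->
    <final>true</final>
  </property>
</configuration>
```

    The 4 GB boxes would keep -Xmx1024m in the same property, with or without <final>, depending on whether jobs should be allowed to change it there.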

   My question is: is this a reasonable way of going about this?  Is there a best practice for dealing with minor node-specific configuration differences?  What I'd really like is not only a mapred-site.xml but also a mapred-node.xml that holds node-specific overrides.  What I can't tell is whether it would do what I'm looking for if I made all the site-level configuration changes in the name node's mapred-site.xml and kept only the node-local changes in each node's own mapred-site.xml.

   Any insight would be appreciated.