MapReduce, mail # user - Re: services requiring topology conf


Re: services requiring topology conf
Adam Faris 2013-01-11, 17:15
A patch was submitted for the topology documentation, but it doesn't appear to have made it into any release.  This svn link may help; start reading at line 1294.
 
http://svn.apache.org/viewvc?view=revision&revision=1411359  

Assuming you are using Hadoop 1.x and not YARN, the topology script only needs to be on the NameNode and JobTracker.  As you have noticed, it doesn't hurt anything to copy the script everywhere, since the TaskTracker and DataNode processes will ignore it.  Try looking at pdsh for controlling compute nodes and pushing files, but be careful: if you type a bad command, it will be run everywhere. http://code.google.com/p/pdsh/
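
For reference, here is a minimal sketch of what such a topology script might look like in Python. Hadoop invokes the script named by topology.script.file.name with one or more IPs or hostnames as arguments and expects one rack path per argument on stdout. The addresses and rack names in the table below are made up for illustration, and resolve() is just a helper name chosen here; a real script would more likely load its mapping from the accompanying conf file.

```python
#!/usr/bin/env python
"""Sketch of a Hadoop rack-topology script (mapping data is hypothetical)."""
import sys

# Rack returned for any host that is not in the table.
DEFAULT_RACK = "/default-rack"

# Hypothetical IP -> rack table; replace with your own data source.
RACK_BY_HOST = {
    "10.0.1.11": "/dc1/rack1",
    "10.0.1.12": "/dc1/rack1",
    "10.0.2.21": "/dc1/rack2",
}


def resolve(hosts, table=RACK_BY_HOST):
    """Return one rack path per host, in the same order as the input."""
    return [table.get(host, DEFAULT_RACK) for host in hosts]


def main(argv):
    # Hadoop passes the hosts to resolve as command-line arguments and
    # reads the rack paths back from stdout, separated by whitespace.
    print(" ".join(resolve(argv[1:])))


if __name__ == "__main__":
    main(sys.argv)
```

Running it by hand with a couple of addresses on the NameNode is a quick way to check the mapping before pointing the daemons at it.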

-- Adam
On Jan 11, 2013, at 9:01 AM, Bryan Beaudreault <[EMAIL PROTECTED]> wrote:

> The documentation on topology conf (topology.script.file.name) is a little sparse, and while we have it working in our cluster I am trying to make it a little easier to configure.
>
> Currently we upload a Python file and a conf file to every node in our cluster.  However, I have a feeling that they are only needed on the NameNode(s) and perhaps the JobTracker.  I checked the code for DataNode and see no reference to this configuration parameter, but I wanted to check with you all before I stop updating the conf on every one of my nodes.
>
> Can anyone confirm whether these configuration files only need to be present on the NameNode/JobTracker, or do they need to be on every node in a cluster?
>
> Thanks