A patch was submitted for the topology documentation, but it doesn't appear to have made it into any release. This svn link may help, starting at line 1294.
Assuming you are using Hadoop 1.x and not YARN, the topology script only needs to be on the NameNode and JobTracker. As you have noticed, it doesn't hurt anything if you copy the script everywhere, since the TaskTracker and DataNode processes will ignore it. Try looking at pdsh for controlling compute nodes and pushing files, but be careful: if you type a bad command, it will get run everywhere. http://code.google.com/p/pdsh/
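For reference, here is a rough sketch of what a topology script along these lines might look like. It is not your script or anything shipped with Hadoop. It assumes the usual contract: the daemon invokes the file named by topology.script.file.name with one or more IPs/hostnames as arguments and reads back one rack path per argument on stdout. The host table, the rack names, and the `resolve` helper are made up for illustration.

```python
#!/usr/bin/env python
"""Minimal sketch of a rack-topology script (hypothetical mapping)."""
import sys

# Fallback rack for hosts that are not in the table.
DEFAULT_RACK = "/default-rack"

# Hypothetical host-to-rack table; a real setup would more likely load this
# from the conf file that is deployed alongside the script.
RACK_BY_HOST = {
    "10.0.1.11": "/dc1/rack1",
    "10.0.1.12": "/dc1/rack1",
    "10.0.2.21": "/dc1/rack2",
}


def resolve(hosts, table=RACK_BY_HOST, default=DEFAULT_RACK):
    """Return one rack path per input host, in the same order."""
    return [table.get(host, default) for host in hosts]


if __name__ == "__main__":
    # Arguments are the hosts to resolve; print the racks space-separated,
    # one entry per argument.
    print(" ".join(resolve(sys.argv[1:])))
```

Since only the NameNode and JobTracker call it, this file and its table would only need to be kept current on those two hosts.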
On Jan 11, 2013, at 9:01 AM, Bryan Beaudreault <[EMAIL PROTECTED]> wrote:
> The documentation on topology conf (topology.script.file.name) is a little sparse, and while we have it working in our cluster I am trying to make it a little easier to configure.
> Currently we upload a python file and conf file to every node in our cluster. However I have a feeling that it is only needed on the NameNode(s) and perhaps JobTracker. I checked the code for DataNode and see no reference to this configuration parameter, but I wanted to check with you all before I stop updating the conf on every one of my nodes.
> Can anyone confirm whether these configuration files only need to be present on the NameNode/JobTracker, or do they need to be on every node in a cluster?
Bryan Beaudreault 2013-01-11, 18:08