You don't want users running anything directly on the cluster.
Instead, you set up a separate machine to launch jobs from; that is the edge node.
Essentially, it can be any sort of Linux machine where you can install Hadoop, but which doesn't run any of the jobs itself.
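
On the consistency point from my earlier reply below, here is a minimal sketch of how you might check whether the copies of log4j.properties on your nodes are actually identical. It assumes you have already pulled each node's copy onto one machine (with scp, for example). The function names and the node-to-path mapping are made up for illustration, and none of this is a Hadoop API.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def group_by_digest(copies):
    """Group node names by the digest of their copy of the config file.

    `copies` is a hypothetical mapping of node name -> local path of the
    copy fetched from that node.
    """
    groups = defaultdict(list)
    for node, path in sorted(copies.items()):
        groups[file_digest(path)].append(node)
    return dict(groups)


def is_consistent(copies):
    """True when every node's copy has byte-identical contents."""
    return len(group_by_digest(copies)) <= 1
```

If is_consistent returns False, group_by_digest shows which nodes share a version of the file, so you can see which ones have drifted.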
On Mar 28, 2012, at 3:30 AM, "Jane Wayne" <[EMAIL PROTECTED]> wrote:
> what do you mean by an edge node? do you mean any node that is not the
> master node (or NameNode or JobTracker node)?
> On Wed, Mar 28, 2012 at 3:51 AM, Michel Segel <[EMAIL PROTECTED]> wrote:
>> First, you really don't want to launch the job from the cluster but from an
>> edge node.
>> To answer your question, in a word, yes: you should keep your configuration
>> files as consistent as possible, noting that over time this may not be
>> fully possible as hardware configs change.
>> Mike Segel
>> On Mar 27, 2012, at 8:42 PM, Jane Wayne <[EMAIL PROTECTED]> wrote:
>>> if i have a hadoop cluster of 10 nodes, do i have to modify the
>>> /hadoop/conf/log4j.properties files on ALL 10 nodes to be the same?
>>> currently, i ssh into the master node to execute a job. this node is the
>>> only place where i have modified the log4j.properties file. i notice that
>>> although my log files are being created, nothing is being written to them.
>>> when i test on cygwin, the logging works, however, when i go to a live
>>> cluster (i.e. amazon elastic mapreduce), the logging output on the master
>>> node no longer works. i wonder if logging is happening at each slave/task node?
>>> could someone explain logging or point me to the documentation discussing
>>> this issue?