Mohit Anchlia 2012-03-02, 00:29
Is this the right procedure to add nodes? I took some of it from the Hadoop wiki FAQ: http://wiki.apache.org/hadoop/FAQ

1. Update conf/slaves
2. On the slave nodes, start the datanode and tasktracker
3. Run hadoop balancer

Do I also need to run dfsadmin -refreshNodes?
Joey Echeverria 2012-03-02, 00:46
You only have to refresh nodes if you're making use of an allow (include) file.
Mohit Anchlia 2012-03-02, 00:49
Thanks. Does that mean that when a tasktracker/datanode starts up, it communicates with the namenode using the masters file?
Joey Echeverria 2012-03-02, 00:57
Not quite. Datanodes get the namenode host from fs.default.name in core-site.xml. Task trackers find the job tracker from the mapred.job.tracker setting in mapred-site.xml.
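As a rough illustration of the two settings named above, the sketch below writes example config fragments into a scratch directory rather than a live conf/ directory. The hostnames, the ports, and the CONF_DIR variable are placeholders chosen for illustration, not values from this thread.

```shell
#!/bin/sh
# Sketch only: writes example fragments to a scratch directory,
# NOT to a live Hadoop conf/ directory. Hosts and ports are placeholders.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

# core-site.xml: datanodes read fs.default.name to locate the namenode.
cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>
EOF

# mapred-site.xml: tasktrackers read mapred.job.tracker to locate the jobtracker.
cat > "$CONF_DIR/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>jobtracker.example.com:8021</value>
  </property>
</configuration>
EOF

echo "wrote example configs to $CONF_DIR"
```

A new worker would need the same values as the rest of the cluster, so in practice these files are usually copied from an existing node rather than written by hand.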
Mohit Anchlia 2012-03-02, 01:35
I actually meant to ask: how does the namenode/jobtracker know there is a new node in the cluster? Is it initiated by the namenode when the slaves file is edited, or by the tasktracker when the tasktracker is started?
George Datskos 2012-03-02, 01:58
Mohit,
New datanodes will connect to the namenode, so that's how the namenode knows. Just make sure the datanodes have the correct {fs.default.name} in their configuration (normally core-site.xml) and then start them. The namenode can, however, choose to reject the datanode if you are using the {dfs.hosts} and {dfs.hosts.exclude} settings in the namenode's hdfs-site.xml.
The namenode doesn't actually care about the slaves file. It's only used by the start/stop scripts.
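Since registration is driven by the worker itself, adding a node can be sketched as starting its daemons locally. This is a minimal sketch assuming a Hadoop 1.x layout where hadoop-daemon.sh lives under $HADOOP_HOME/bin; the default HADOOP_HOME path and the run, add_worker, and rebalance helpers are hypothetical names for illustration. With DRY_RUN=1 (the default here) the commands are only printed.

```shell
#!/bin/sh
# Sketch: bring a new worker into the cluster by starting its daemons locally.
# Assumes a Hadoop 1.x layout; the HADOOP_HOME default is a placeholder path.
HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = actually execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

# Run on the NEW node: each daemon contacts the namenode/jobtracker on startup.
add_worker() {
  run "$HADOOP_HOME/bin/hadoop-daemon.sh" start datanode
  run "$HADOOP_HOME/bin/hadoop-daemon.sh" start tasktracker
}

# Optional follow-up: spread existing blocks onto the new datanode.
rebalance() {
  run "$HADOOP_HOME/bin/hadoop" balancer
}

add_worker
rebalance
```

Editing conf/slaves is still worthwhile so that later runs of the start/stop scripts include the new host, but it is not what makes the namenode aware of it.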
Arpit Gupta 2012-03-02, 01:52
It is initiated by the slave. If you have defined files that state which slaves can talk to the namenode (using the dfs.hosts property) and which hosts cannot (using the dfs.hosts.exclude property), then you need to edit those files and issue the refresh command (dfsadmin -refreshNodes).
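For clusters that do restrict datanodes this way, the edit-then-refresh step might look like the sketch below. It assumes INCLUDE_FILE is set to whatever path dfs.hosts points to in your namenode's configuration; the default path and the allow_host helper are hypothetical. With DRY_RUN=1 (the default here) the refresh command is printed rather than executed.

```shell
#!/bin/sh
# Sketch: admit a new host when the namenode restricts datanodes via dfs.hosts.
# INCLUDE_FILE must match the path configured in dfs.hosts; this default is a placeholder.
INCLUDE_FILE="${INCLUDE_FILE:-/tmp/dfs.hosts.example}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the refresh command only, 0 = execute it

allow_host() {
  host="$1"
  touch "$INCLUDE_FILE"
  # Append the host only if it is not already listed (exact whole-line match).
  grep -qxF "$host" "$INCLUDE_FILE" || echo "$host" >> "$INCLUDE_FILE"
  # Ask the namenode to re-read its include/exclude files.
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: hadoop dfsadmin -refreshNodes"
  else
    hadoop dfsadmin -refreshNodes
  fi
}
```

Usage, run on the namenode host: allow_host worker05.example.com (the hostname is an example). If dfs.hosts is not configured at all, this step is unnecessary.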
Mohit Anchlia 2012-03-02, 01:59
Thanks all for the answers!!
Raj Vishwanathan 2012-03-02, 01:10
The masters and slaves files, if I remember correctly, are used to start the correct daemons on the correct nodes from the master node.

Raj
anil gupta 2012-03-02, 01:42
Whatever Joey said is correct for Cloudera's distribution. I am not confident it holds for other distributions, as I haven't tried them.

Thanks,
Anil
Raj Vishwanathan 2012-03-02, 01:51
What Joey said is correct for both the Apache and Cloudera distros. The DN/TT daemons will connect to the NN/JT using the config files. The masters and slaves files are used for starting the correct daemons.