-
Basic question on HDFS - MR
Stuti Awasthi 2011-10-18, 11:03
Hi, I have a very basic question on MR jobs. Suppose I have a 3-node cluster with 1 NN and 3 DN. I wanted to run an MR job, so I placed the jar on one of the cluster machines and executed it; it runs fine. Now my question is: do I need to copy my MR job to every DN for distributed processing? Or does it make no difference which machine the MR jar is placed on, and the MR job will still run in a distributed fashion?
Thanks Stuti
________________________________ ::DISCLAIMER:: -----------------------------------------------------------------------------------------------------------------------
The contents of this e-mail and any attachment(s) are confidential and intended for the named recipient(s) only. It shall not attach any liability on the originator or HCL or its affiliates. Any views or opinions presented in this email are solely those of the author and may not necessarily reflect the opinions of HCL or its affiliates. Any form of reproduction, dissemination, copying, disclosure, modification, distribution and / or publication of this message without the prior written consent of the author of this e-mail is strictly prohibited. If you have received this email in error please delete it and notify the sender immediately. Before opening any mail and attachments please check them for viruses and defect.
-----------------------------------------------------------------------------------------------------------------------
-
Re: Basic question on HDFS - MR
Bejoy KS 2011-10-18, 11:17
Hi Stuti, you don't need to do anything manually to distribute your jar across the TaskTrackers (DN). Place your jar in some directory on the local file system (LFS) and specify its path in the hadoop jar command; Hadoop then internally copies the jar to all the required TaskTrackers. You can place the jar in any location, and the JobTracker will distribute it across the TTs.
Hope it helps
Regards Bejoy.K.S
-
Re: Basic question on HDFS - MR
Uma Maheswara Rao G 72686... 2011-10-18, 11:20
You did not mention the TaskTrackers.
We need not place the job jar explicitly on any node. When the JT assigns a job to a particular TT, all the resources required for the job are copied to the TT from HDFS. You just need to make sure that any newly added TT is registered with the JT.
Regards, Uma
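To make the flow described above concrete, here is a toy model in Python. It does not use any Hadoop API: plain directories stand in for the local file system, HDFS, and each TaskTracker's working area, and the names submit_job, localize, and the job ID are hypothetical, chosen only for illustration. The point is that the jar exists in one place at submission time and each tracker pulls its own copy from the shared store.

```python
import shutil
import tempfile
from pathlib import Path


def submit_job(local_jar: Path, shared_fs: Path, job_id: str) -> Path:
    """Client side: copy the jar from the local file system into a
    per-job staging directory on the shared store (stand-in for HDFS)."""
    staging = shared_fs / "staging" / job_id
    staging.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy(local_jar, staging / "job.jar"))


def localize(staged_jar: Path, tracker_dir: Path, job_id: str) -> Path:
    """Tracker side: pull the staged jar into this tracker's own
    working directory before it runs any task of the job."""
    work = tracker_dir / job_id
    work.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy(staged_jar, work / "job.jar"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # The jar lives on exactly one machine when the job is submitted.
        jar = root / "client" / "wordcount.jar"
        jar.parent.mkdir()
        jar.write_bytes(b"fake jar bytes")

        staged = submit_job(jar, root / "hdfs", "job_0001")
        # Three hypothetical trackers each fetch their own copy.
        for i in range(3):
            copy = localize(staged, root / f"tt{i}", "job_0001")
            print(copy.relative_to(root))
```

Nothing is copied to a tracker by hand in this sketch; the only manual step is the submission from the machine that holds the jar.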
-
RE: Basic question on HDFS - MR
Stuti Awasthi 2011-10-18, 11:26
Thanks Bejoy, Uma. This clears my doubt.