Hadoop, mail # dev - How to run hadoop jar command in a clustered environment


Thoihen Maibam 2013-04-15, 16:36
Chris Nauroth 2013-04-15, 17:32
Re: How to run hadoop jar command in a clustered environment
maisnam ns 2013-04-15, 18:07
@Chris, thanks a lot, that helped a lot.
On Mon, Apr 15, 2013 at 11:02 PM, Chris Nauroth <[EMAIL PROTECTED]> wrote:

> Hello Thoihen,
>
> I'm moving this discussion from common-dev (questions about developing
> Hadoop) to user (questions about using Hadoop).
>
> If you haven't already seen it, then I recommend reading the cluster setup
> documentation.  It's a bit different depending on the version of the Hadoop
> code that you're deploying and running.  You mentioned JobTracker, so I
> expect that you're using something from the 1.x line, but here are links to
> both 1.x and 2.x docs just in case:
>
> 1.x: http://hadoop.apache.org/docs/r1.1.2/cluster_setup.html
> 2.x/trunk: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/ClusterSetup.html
>
> To address your specific questions:
>
> 1. You can run the hadoop jar command and submit MapReduce jobs from any
> machine that has the Hadoop software and configuration deployed and has
> network connectivity to the machines that make up the Hadoop cluster.
>
> 2. Yes, you can use a separate machine that is not a member of the cluster
> (meaning it does not run Hadoop daemons like DataNode, TaskTracker, or
> NodeManager).  This is your choice.  I've found it valuable to isolate
> nodes like this to prevent MR job tasks from taking processing resources
> away from interactive user commands, but this does mean that the resources
> on that node can't be utilized by MR jobs during user idle times, so it
> causes a small hit to overall utilization.
>
> Hope this helps,
> --Chris
>
>
> On Mon, Apr 15, 2013 at 9:36 AM, Thoihen Maibam <[EMAIL PROTECTED]> wrote:
>
> > Hi All,
> >
> > I am really new to Hadoop and have installed Hadoop on my local Ubuntu
> > machine. I also created a wordcount.jar, started Hadoop with start-all.sh,
> > which started all the Hadoop daemons, and used jps to confirm it. I then
> > changed to hadoop/bin, ran hadoop jar x.jar, and successfully ran the
> > MapReduce program.
> >
> > Now, can someone please explain how I should run the hadoop jar command
> > in a clustered environment, for example a cluster with 50 nodes? I know
> > that one dedicated machine would be the NameNode, another the JobTracker,
> > and the others DataNodes and TaskTrackers.
> >
> > 1. From which machine should I run the hadoop jar command, given that I
> > have a MapReduce jar in hand? Should I run it from the JobTracker
> > machine, or can I run it from any machine in the cluster?
> >
> > 2. Can I run the MapReduce job from another machine that is not part of
> > the cluster? If yes, how should I do it?
> >
> > Please help me.
> >
> > Regards
> > thoihen
> >
>
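
Chris's two answers come down to the same thing: the submitting machine needs only the Hadoop software, a copy of the cluster's configuration, and network access to the cluster. The following is a minimal sketch of that, not a definitive procedure. It assumes the cluster's configuration files have been copied to a directory on the client, and it uses the hadoop launcher's --config option to point at that directory. The directory /etc/hadoop/conf, the class name WordCount, and the input and output paths are hypothetical values chosen for illustration. The helper names build_hadoop_jar_command and submit are also made up for this sketch.

```python
import shlex
import subprocess


def build_hadoop_jar_command(conf_dir, jar_path, main_class, job_args):
    """Return the argv list for submitting a MapReduce jar from a client machine."""
    # --config tells the hadoop launcher which configuration directory to use,
    # so the client picks up the cluster's NameNode and JobTracker addresses.
    return ["hadoop", "--config", conf_dir, "jar", jar_path, main_class, *job_args]


def submit(command):
    """Run the command and return its exit code (needs hadoop on the PATH)."""
    return subprocess.run(command, check=False).returncode


# Hypothetical values for illustration only; substitute your own.
command = build_hadoop_jar_command(
    conf_dir="/etc/hadoop/conf",
    jar_path="wordcount.jar",
    main_class="WordCount",
    job_args=["/user/thoihen/input", "/user/thoihen/output"],
)

# Print the command line instead of running it, so the sketch is safe to try
# on a machine without Hadoop installed.
print(shlex.join(command))
```

Calling submit(command) on a machine that has Hadoop installed and can reach the cluster would hand the job to the cluster described by the files in the configuration directory. The machine itself does not need to run a DataNode, TaskTracker, or NodeManager.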