-
understanding hadoop job submission
Arindam Choudhury 2012-04-25, 08:44
Hi,
I am new to hadoop and I am trying to understand hadoop job submission.
We submit the job using:
hadoop jar some.jar name input output
This in turn invokes RunJar. But in RunJar I cannot find any JobSubmit() or any call to JobClient.
Then how does the job get submitted to the JobTracker?
-Arindam
-
RE: understanding hadoop job submission
Devaraj k 2012-04-25, 08:56
Hi Arindam,
hadoop jar jarFileName MainClassName
The above command does not submit the job by itself. It only executes the jar file using the main class: the Main-Class entry in the jar's manifest if one is present, otherwise the class name passed as an argument (MainClassName in the above command). Any additional arguments given in the command are passed to the main class as its args.
The job submission code can be in the main class or in any other class in the jar file. Take a look at the WordCount example to see how a job is submitted. Thanks Devaraj
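To make the point above concrete, here is a minimal sketch of the launch step using only the Java standard library. It is not Hadoop's RunJar source: MiniRunJar, DemoDriver and the manifestMainClass parameter are hypothetical names made up for illustration, and the jar unpacking and class loader setup that a real launcher needs are left out. It only shows that the launcher picks a main class and calls its main() by reflection, so any job submission has to happen inside that main().

```java
import java.lang.reflect.Method;
import java.util.Arrays;

public class MiniRunJar {

    /** Hypothetical stand-in for a user's driver class such as WordCount. */
    public static class DemoDriver {
        // Records the arguments that main() received, for inspection.
        public static String[] received;

        public static void main(String[] args) {
            // A real driver would configure and submit the job here,
            // for example with job.waitForCompletion(true).
            received = args;
        }
    }

    /**
     * Picks the main class as described in the thread: the manifest's
     * Main-Class if there is one, otherwise the first command-line argument.
     * All remaining arguments are passed to that class's main().
     */
    public static void run(String manifestMainClass, String[] args) {
        int firstArg = 0;
        String mainClassName = manifestMainClass;
        if (mainClassName == null) {
            if (args.length == 0) {
                throw new IllegalArgumentException("no main class given");
            }
            mainClassName = args[firstArg++];
        }
        String[] rest = Arrays.copyOfRange(args, firstArg, args.length);
        try {
            Class<?> mainClass = Class.forName(mainClassName);
            Method main = mainClass.getMethod("main", String[].class);
            // main() is static, so the receiver is null; the cast keeps the
            // String[] from being expanded into separate varargs values.
            main.invoke(null, (Object) rest);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot run " + mainClassName, e);
        }
    }

    public static void main(String[] args) {
        // Mirrors "hadoop jar some.jar name input output" with no manifest entry.
        run(null, new String[] {DemoDriver.class.getName(), "input", "output"});
        System.out.println(Arrays.toString(DemoDriver.received)); // prints [input, output]
    }
}
```

Nothing in run() knows about jobs at all, which is why no call to JobClient shows up in the launcher itself.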
-
Re: understanding hadoop job submission
Jay Vyas 2012-04-25, 09:41
Yes, the job is submitted by the API calls in the MapReduce code.
-- Jay Vyas MMSB/UCHC
-
Re: understanding hadoop job submission
Arindam Choudhury 2012-04-25, 09:57
Hi,
The code is:
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }
        Job job = new Job(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
I understand it now. But is it possible to write a program that uses JobClient to submit the Hadoop job?
To do that, I would have to create a JobConf manually. Am I thinking right?
Arindam
-
RE: understanding hadoop job submission
Devaraj k 2012-04-25, 10:25
You can submit the job in either of the following ways:
1. To submit the job through JobClient, create a JobConf and pass it to the JobClient.runJob(JobConf conf) API.
2. Alternatively, create a Job instance from a Configuration object and submit it with submit() or waitForCompletion(), as in the code you posted. In this case there is no need to create a JobConf instance.
Thanks Devaraj
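As an illustration of the first option, here is a minimal sketch using the old org.apache.hadoop.mapred API. It assumes the Hadoop 1.x jars are on the classpath. The class name IdentityJobSubmitter, the helper buildConf and the job name are hypothetical, and the stock IdentityMapper and IdentityReducer stand in for real map and reduce classes so that the example stays short.

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.IdentityReducer;

public class IdentityJobSubmitter {

    /** Builds the JobConf by hand, as the question suggests. */
    public static JobConf buildConf(String in, String out) {
        // Passing a class lets Hadoop locate the jar that contains it.
        JobConf conf = new JobConf(IdentityJobSubmitter.class);
        conf.setJobName("identity copy");

        // Pass-through mapper and reducer shipped with Hadoop.
        conf.setMapperClass(IdentityMapper.class);
        conf.setReducerClass(IdentityReducer.class);

        // The default TextInputFormat yields (LongWritable offset, Text line)
        // pairs, and the identity classes emit them unchanged.
        conf.setOutputKeyClass(LongWritable.class);
        conf.setOutputValueClass(Text.class);

        FileInputFormat.setInputPaths(conf, new Path(in));
        FileOutputFormat.setOutputPath(conf, new Path(out));
        return conf;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: IdentityJobSubmitter <in> <out>");
            System.exit(2);
        }
        // Submits the job and blocks until it finishes; an IOException is
        // thrown if the job fails.
        JobClient.runJob(buildConf(args[0], args[1]));
    }
}
```

Packaged into a jar, this would be started the same way as before, for example hadoop jar some.jar IdentityJobSubmitter input output. The only difference from the second option is which API the driver's main() calls.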