-
How do I get a JobStatus object?
Aaron Baff 2011-02-16, 18:39
I'm submitting jobs via JobClient.submitJob(JobConf), and then waiting until each one completes with RunningJob.waitForCompletion(). I then want to find out how long the entire MR job took, which appears to need the JobStatus, since RunningJob doesn't provide anything I can use for that. The only way I can see to do it right now is JobClient.getAllJobs(), which gives me an array of all the jobs that are submitted (currently running? all previous?). Anyone know how I could go about doing this?
--Aaron
-
Re: How do I get a JobStatus object?
madhu phatak 2011-02-17, 06:34
Rather than running jobs and waiting for completion, you can use JobControl to control the jobs. JobControl gives you access to which jobs have completed, which are running, which have failed, etc.
-
RE: How do I get a JobStatus object?
Aaron Baff 2011-02-17, 18:14
> From: madhu phatak [mailto:[EMAIL PROTECTED]]
>
> Rather than running jobs by wait for completion you can use jobcontrol to
> control the jobs . JobControl give access to the what all jobs are completed
> ,running and failed etc
This is almost what I want, but it doesn't give me access to the data I'm looking for. I'm specifically looking at org.apache.hadoop.mapreduce.JobStatus and its getStartTime() and getFinishTime() methods. The only places I've seen that return a JobStatus object are JobClient's getAllJobs(), getJobsFromQueue(), and jobsToComplete().
--Aaron
-
Re: How do I get a JobStatus object?
Harsh J 2011-02-18, 02:55
Hello,
The mapreduce.Cluster class in the current release can give you a 'Job' object provided a JobID is known. The Job class also has the information you seek for a particular job (start/finish times and more).
The JobStatus results you get through JobClient are limited to what the JobTracker (JT) holds in its memory at the time of the call.
-- Harsh J www.harshj.com
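To make the "how long did the job take" step concrete, here is a minimal Java sketch of the arithmetic only. It assumes the start and finish times come back as epoch milliseconds (for example from the Job handle's start/finish time accessors mentioned above; check the Javadoc of your release), and it assumes a non-positive value means "not set yet". The class and method names (JobTiming, elapsedMillis, format) are hypothetical and not part of Hadoop.

```java
import java.time.Duration;
import java.util.Locale;

/**
 * Hypothetical helper, not part of Hadoop: turns a job's start and finish
 * timestamps into an elapsed time.
 * Assumption: both timestamps are epoch milliseconds.
 */
final class JobTiming {
    private JobTiming() {}

    /** Returns finish - start in milliseconds, rejecting unset or inverted values. */
    static long elapsedMillis(long startMillis, long finishMillis) {
        // Assumption of this sketch: a value <= 0 means the timestamp is not set,
        // e.g. the job has not finished yet.
        if (startMillis <= 0 || finishMillis <= 0) {
            throw new IllegalArgumentException("start or finish time is not set");
        }
        if (finishMillis < startMillis) {
            throw new IllegalArgumentException("finish time is before start time");
        }
        return finishMillis - startMillis;
    }

    /** Formats a duration in milliseconds as minutes, seconds and milliseconds. */
    static String format(long elapsedMillis) {
        Duration d = Duration.ofMillis(elapsedMillis);
        return String.format(Locale.ROOT, "%dm %ds %dms",
                d.toMinutes(), d.getSeconds() % 60, d.toMillis() % 1000);
    }
}
```

With a job handle in hand, one would pass its start and finish times to elapsedMillis() and print the result with format(); for instance, an elapsed time of 125250 ms is rendered as "2m 5s 250ms".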
-
RE: How do I get a JobStatus object?
Aaron Baff 2011-02-18, 16:34
> The mapreduce.Cluster class in the current release can give you a
> 'Job' object provided a JobID is known. The Job class also has the
> information you seek for a particular job (start/finish times and
> more).
>
> JobClient -> JobStatus results would be out of what the JT carries in
> its memory at the time of call.

Thanks Harsh, yes, that is exactly what I was looking for. Some of the documentation, and the ton of classes that all seem very similar to each other, make for a confusing situation sometimes.
The other issue I'm running into now is how to get the output path using FileOutputFormat.getOutputPath(). The property that method looks for does not seem to exist within the job config. When I grab the configuration from the Job, its toString() shows that it has an HDFS path to a job config XML file, but when I look using the CLI fs client, that file does not exist! Nor does any other output for any job I've run recently! On the JobTracker web interface I can view the config, although that appears to be served from the JobTracker and not from HDFS. This is getting very perplexing, unless I missed reading some bit of documentation that makes this all clear.
--Aaron