The JobClient class in the Java API lets you query all jobs, and it
also exposes some task-level information as part of its public API.
In YARN (2.x onwards), the MRv2 ApplicationMaster (AM) publishes a
REST API that you can query for job details. As a first step, the
ResourceManager (RM) lets you get a list of such AMs. This sounds
more like what you need.
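
As a rough illustration of that first step, here is a minimal sketch
that asks the RM for its running applications as XML and prints one
line per application. The host name "hadoop" is taken from your URL,
while port 8088 (the usual RM web port), the class name, and the helper
names are assumptions on my part. The exact query parameters and XML
fields can differ between releases, so please check them against your
version before relying on this.

```java
import java.io.InputStream;
import java.io.StringReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of Hadoop itself.
class RunningAppsPoller {

    // Turns an <apps><app>...</app></apps> document into
    // "id name state progress%" lines, one per application.
    static List<String> summarize(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList apps = doc.getElementsByTagName("app");
            List<String> lines = new ArrayList<>();
            for (int i = 0; i < apps.getLength(); i++) {
                Element app = (Element) apps.item(i);
                lines.add(text(app, "id") + " " + text(app, "name") + " "
                        + text(app, "state") + " " + text(app, "progress") + "%");
            }
            return lines;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse RM response", e);
        }
    }

    // Returns the text of the first descendant element with this tag,
    // or an empty string if there is none.
    static String text(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent();
    }

    // Fetches the running applications from the RM's REST endpoint.
    static List<String> fetchRunningApps(String rmBaseUrl) throws Exception {
        URI uri = URI.create(rmBaseUrl + "/ws/v1/cluster/apps?states=RUNNING");
        HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection();
        conn.setRequestProperty("Accept", "application/xml");
        try (InputStream in = conn.getInputStream()) {
            return summarize(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumed default; pass your own RM address as the first argument.
        String rm = args.length > 0 ? args[0] : "http://hadoop:8088";
        for (String line : fetchRunningApps(rm)) {
            System.out.println(line);
        }
    }
}
```

Each application entry also carries a tracking URL, which is where you
would go next to reach the AM's own REST API for per-job map and reduce
progress.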
A REST API similar to the one YARN provides for MRv2 was also added
to the 1.x JobTracker recently via
https://issues.apache.org/jira/browse/MAPREDUCE-4837, and is available
from the 1.2.0 release onwards.
On Thu, Mar 7, 2013 at 12:13 AM, Kyle B <[EMAIL PROTECTED]> wrote:
> I was wondering if the Hadoop job tracker had an API, such as a web service
> or xml feed? I'm trying to track Hadoop jobs as they progress. Right now,
> I'm parsing the HTML of the "Running Jobs" section at
> http://hadoop:50030/jobtracker.jsp, but this is definitely not desired if
> there is a better way. Is there a simple web service for the data on
> jobtracker.jsp & jobdetails.jsp?
> Has anyone run into this problem of trying to track Hadoop progress from a
> remote machine programmatically?
> Any help is appreciated,