MapReduce, mail # user - doubt on Hadoop job submission process
doubt on Hadoop job submission process
Manoj Babu 2012-08-13, 10:12
Hi All,

Normal Hadoop job submission process involves:

   1. Checking the input and output specifications of the job.
   2. Computing the InputSplits for the job
      (http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/InputSplit.html).
   3. Setting up the requisite accounting information for the job's
      DistributedCache
      (http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/filecache/DistributedCache.html),
      if necessary.
   4. Copying the job's jar and configuration to the map-reduce system
      directory on the distributed file-system.
   5. Submitting the job to the JobTracker and optionally monitoring its
      status.
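
For what it's worth, step 4 can be pictured roughly like this. This is a toy
Python sketch of the idea only, not Hadoop's actual code; the paths, file
names, and the job id below are all made up for illustration:

```python
# Toy sketch of step 4: the submitting client copies the job's jar and its
# serialized configuration (job.xml) into a per-job staging directory on the
# shared file system, so the JobTracker and TaskTrackers can fetch them.
# All names and paths here are illustrative, not Hadoop's real ones.
import shutil
import tempfile
from pathlib import Path

def stage_job_files(jar_path: Path, conf_path: Path,
                    staging_dir: Path, job_id: str) -> Path:
    """Copy the job jar and config into <staging_dir>/<job_id>/ and return that dir."""
    job_dir = staging_dir / job_id
    job_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy(jar_path, job_dir / "job.jar")
    shutil.copy(conf_path, job_dir / "job.xml")
    return job_dir

# Demo with fake local files standing in for the user's jar and config.
tmp = Path(tempfile.mkdtemp())
(tmp / "wordcount.jar").write_text("fake jar bytes")
(tmp / "job.xml").write_text("<configuration/>")

job_dir = stage_job_files(tmp / "wordcount.jar", tmp / "job.xml",
                          tmp / "staging", "job_201208_0001")
print((job_dir / "job.jar").exists(), (job_dir / "job.xml").exists())
```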

I have a doubt about the 4th point of the job execution flow; could any of
you explain it?

   - What is the job's jar?
   - Is the job's jar the one we submitted to Hadoop, or does Hadoop build
     it based on the job configuration object?

Cheers!
Manoj.