MapReduce >> mail # user >> doubt on Hadoop job submission process


doubt on Hadoop job submission process
Hi All,

The normal Hadoop job submission process involves:

   1. Checking the input and output specifications of the job.
   2. Computing the InputSplit<http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/InputSplit.html>s for the job.
   3. Setting up the requisite accounting information for the DistributedCache<http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/filecache/DistributedCache.html> of the job, if necessary.
   4. Copying the job's jar and configuration to the map-reduce system directory on the distributed file-system.
   5. Submitting the job to the JobTracker and optionally monitoring its status.
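As an aside, step 2 above can be sketched in stand-alone form. The snippet below mirrors the split-size rule used by Hadoop's FileInputFormat (splitSize = max(minSize, min(maxSize, blockSize)) in the new API); the class and helper names are illustrative, not the actual Hadoop source.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSketch {
    // Split-size rule as in FileInputFormat (new API):
    // splitSize = max(minSize, min(maxSize, blockSize)).
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Chop a file of the given length into (offset, length) split ranges.
    static List<long[]> getSplits(long fileLength, long splitSize) {
        List<long[]> splits = new ArrayList<long[]>();
        long offset = 0;
        while (offset < fileLength) {
            long len = Math.min(splitSize, fileLength - offset);
            splits.add(new long[] { offset, len });
            offset += len;
        }
        return splits;
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024;   // 64 MB HDFS block
        long splitSize = computeSplitSize(blockSize, 1, Long.MAX_VALUE);
        // A 150 MB file yields splits of 64 + 64 + 22 MB.
        List<long[]> splits = getSplits(150L * 1024 * 1024, splitSize);
        System.out.println(splits.size());    // prints 3
    }
}
```

In the real code the split list is then serialized into the job's staging directory alongside the jar and configuration (step 4), so the JobTracker can schedule one map task per split.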

I have a doubt about the 4th point of the job execution flow; could any of you explain it?

   - What is the job's jar?
   - Is the job's jar the one we submitted to Hadoop, or will Hadoop build it based on the job configuration object?

Cheers!
Manoj.