MapReduce >> mail # user >> doubt on Hadoop job submission process


Re: doubt on Hadoop job submission process
Hi Harsh,

Thanks for your reply.

Suppose my main program performs many non-Hadoop activities
(reading, writing, updating) before invoking JobClient.runJob(conf);
Is there any way to separate the process flow programmatically instead of
using a workflow engine?

Cheers!
Manoj.
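
One way to separate the flow programmatically, without a workflow engine, is to split the driver into named steps and run them in order, stopping at the first failure. The following is only a minimal sketch in plain Java under that assumption: the class JobFlowSketch, the Step interface, and the step names are hypothetical, and the "submit-job" step prints a placeholder where a real driver would call JobClient.runJob(conf).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical driver layout: each phase of the main program is a named step.
class JobFlowSketch {
    /** One phase of the driver; it may throw to abort the rest of the flow. */
    interface Step {
        void run() throws Exception;
    }

    // LinkedHashMap keeps the steps in the order they were added.
    private final Map<String, Step> steps = new LinkedHashMap<>();

    JobFlowSketch add(String name, Step step) {
        steps.put(name, step);
        return this;
    }

    /** Runs the steps in order, stops at the first failure, and returns the names of the completed steps. */
    List<String> runAll() {
        List<String> completed = new ArrayList<>();
        for (Map.Entry<String, Step> entry : steps.entrySet()) {
            try {
                entry.getValue().run();
            } catch (Exception ex) {
                System.err.println("Step failed: " + entry.getKey() + " (" + ex.getMessage() + ")");
                break;
            }
            completed.add(entry.getKey());
        }
        return completed;
    }

    public static void main(String[] args) {
        List<String> done = new JobFlowSketch()
            .add("prepare-input", () -> System.out.println("non-Hadoop reads/writes/updates go here"))
            // Assumption: a real driver would call JobClient.runJob(conf) in this step.
            .add("submit-job", () -> System.out.println("placeholder for JobClient.runJob(conf)"))
            .add("post-process", () -> System.out.println("non-Hadoop follow-up work goes here"))
            .runAll();
        System.out.println("Completed: " + done);
    }
}
```

Keeping the Hadoop submission as one step among several means the non-Hadoop work can be tested or rerun separately from the job itself.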

On Mon, Aug 13, 2012 at 4:10 PM, Harsh J <[EMAIL PROTECTED]> wrote:

> Hi Manoj,
>
> Reply inline.
>
> On Mon, Aug 13, 2012 at 3:42 PM, Manoj Babu <[EMAIL PROTECTED]> wrote:
> > Hi All,
> >
> > A normal Hadoop job submission involves:
> >
> > 1. Checking the input and output specifications of the job.
> > 2. Computing the InputSplits for the job.
> > 3. Setting up the requisite accounting information for the DistributedCache of the job, if necessary.
> > 4. Copying the job's jar and configuration to the map-reduce system directory on the distributed file-system.
> > 5. Submitting the job to the JobTracker and optionally monitoring its status.
> >
> > I have a doubt about the 4th point of the job execution flow. Could any of you explain it?
> >
> > What is job's jar?
>
> The job.jar is the jar you supply via "hadoop jar <jar>". Technically,
> though, it is the jar pointed to by JobConf.getJar() (set via the setJar or
> setJarByClass calls).
>
> > Is the job's jar the one we submitted to Hadoop, or will Hadoop build it based on the job configuration object?
>
> It is the former, as explained above.
>
> --
> Harsh J
>
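
The reply above mentions setJarByClass, which sets the job's jar by locating the jar that contains a given class. As a rough illustration of that general idea, and not Hadoop's actual implementation, the sketch below asks the JVM where a class was loaded from using only standard JDK calls. The class name JarLocatorSketch and the method locationOf are hypothetical.

```java
import java.net.URL;
import java.security.CodeSource;
import java.util.Optional;

// Hypothetical helper: reports where the JVM loaded a class from.
class JarLocatorSketch {
    /** Returns the jar or directory a class was loaded from, if the JVM exposes it. */
    static Optional<URL> locationOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        // Classes from the bootstrap loader (e.g. java.lang.String) have no code source.
        if (source == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(source.getLocation());
    }

    public static void main(String[] args) {
        System.out.println("java.lang.String: " + locationOf(String.class));
        System.out.println("This class: " + locationOf(JarLocatorSketch.class));
    }
}
```

When a driver class is packaged in a jar and run from it, the location reported for that class would be the jar file, which is the kind of path JobConf.getJar() is described as pointing to.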