Re: doubt on Hadoop job submission process
Hi Harsh,

Thanks for your reply.

Suppose my main program performs many activities (reading, writing,
updating; all non-Hadoop work) before invoking JobClient.runJob(conf).
Is there any way to separate the process flow programmatically instead of
using a workflow engine?

Cheers!
Manoj.

On Mon, Aug 13, 2012 at 4:10 PM, Harsh J <[EMAIL PROTECTED]> wrote:

> Hi Manoj,
>
> Reply inline.
>
> On Mon, Aug 13, 2012 at 3:42 PM, Manoj Babu <[EMAIL PROTECTED]> wrote:
> > Hi All,
> >
> > The normal Hadoop job submission process involves:
> >
> > 1. Checking the input and output specifications of the job.
> > 2. Computing the InputSplits for the job.
> > 3. Setting up the requisite accounting information for the
> >    DistributedCache of the job, if necessary.
> > 4. Copying the job's jar and configuration to the map-reduce system
> >    directory on the distributed file-system.
> > 5. Submitting the job to the JobTracker and optionally monitoring its
> >    status.
> >
> > I have a doubt about the 4th point of the job execution flow. Could any
> > of you explain it?
> >
> > What is the job's jar?
>
> The job.jar is the jar you supply via "hadoop jar <jar>". Technically,
> though, it is the jar pointed to by JobConf.getJar() (set via the setJar
> or setJarByClass calls).
>
> > Is the job's jar the one we submitted to Hadoop, or will Hadoop build
> > one based on the job configuration object?
>
> It is the former, as explained above.
>
> --
> Harsh J
>