Re: doubt on Hadoop job submission process
Manoj Babu 2012-08-13, 11:13
Thanks for your reply.
Consider that in my main program I am doing many activities
(reading/writing/updating, all non-Hadoop) before
Is there any way to separate the process flow programmatically instead of
using a workflow engine?
On Mon, Aug 13, 2012 at 4:10 PM, Harsh J <[EMAIL PROTECTED]> wrote:
> Hi Manoj,
> Reply inline.
> On Mon, Aug 13, 2012 at 3:42 PM, Manoj Babu <[EMAIL PROTECTED]> wrote:
> > Hi All,
> > The normal Hadoop job submission process involves:
> > 1. Checking the input and output specifications of the job.
> > 2. Computing the InputSplits for the job.
> > 3. Setting up the requisite accounting information for the
> >    DistributedCache of the job, if necessary.
> > 4. Copying the job's jar and configuration to the map-reduce system
> >    on the distributed file-system.
> > 5. Submitting the job to the JobTracker and optionally monitoring its
> > I have a doubt about the 4th point of the job execution flow; could any of you
> > it?
> > What is job's jar?
> The job.jar is the jar you supply via "hadoop jar <jar>". Technically
> though, it is the jar pointed to by JobConf.getJar() (set via the setJar
> or setJarByClass calls).
> > Is the job's jar the one we submitted to Hadoop, or will Hadoop build one
> > based on the job configuration object?
> It is the former, as explained above.
> Harsh J
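
To illustrate the point about the job jar being tied to a class (the
setJarByClass case mentioned above), here is a minimal sketch in plain Java
of asking the JVM where a class was loaded from. This is an assumption-laden
illustration, not Hadoop's actual implementation: JarLocator and locate() are
hypothetical names, and the sketch uses only the standard
ProtectionDomain/CodeSource API rather than any Hadoop call.

```java
import java.net.URL;
import java.security.CodeSource;

// Hypothetical helper, not part of Hadoop: reports where a class was loaded from.
class JarLocator {

    // Returns the location (jar file or classes directory) that cls was
    // loaded from, or null when the JVM does not record one, as is the
    // case for core JDK classes loaded by the bootstrap loader.
    static String locate(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        if (source == null) {
            return null;
        }
        URL location = source.getLocation();
        return location == null ? null : location.toString();
    }

    public static void main(String[] args) {
        // When this class is packaged in a jar, this prints the jar's URL.
        System.out.println("Driver class loaded from: " + locate(JarLocator.class));
        // Core JDK classes have no code source, so this prints null.
        System.out.println("java.lang.String loaded from: " + locate(String.class));
    }
}
```

If the class is run from a jar, the first line shows that jar's location,
which is the same kind of "jar containing this class" lookup that a
setJarByClass-style call relies on; when run from a classes directory there
is no jar to find.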