MapReduce, mail # user - doubt on Hadoop job submission process


Re: doubt on Hadoop job submission process
Manoj Babu 2012-08-13, 12:20
Then I need to submit the jar containing the non-Hadoop activity classes
and their supporting libraries to all the nodes, since I can't create
two jars.
Is there any way to optimize this?
Cheers!
Manoj.
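
One common way to avoid a single fat jar is to keep only the MapReduce
classes in the job jar and ship the supporting libraries through the
DistributedCache instead. A minimal sketch against the JobConf/JobClient
API of that era; the class name and HDFS paths below are placeholders,
not from the thread:

    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class SubmitWithLibs {
      public static void main(String[] args) throws Exception {
        // The constructor sets the job jar to the one containing this class.
        JobConf conf = new JobConf(SubmitWithLibs.class);
        FileInputFormat.addInputPath(conf, new Path("/in"));
        FileOutputFormat.setOutputPath(conf, new Path("/out"));
        // Supporting jar already uploaded to HDFS; it is added to the task
        // classpath on every node, so it need not be bundled into job.jar.
        DistributedCache.addFileToClassPath(new Path("/libs/support-lib.jar"), conf);
        JobClient.runJob(conf);
      }
    }

The same effect is available from the command line via the -libjars
option, provided the driver goes through ToolRunner/GenericOptionsParser.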

On Mon, Aug 13, 2012 at 5:20 PM, Harsh J <[EMAIL PROTECTED]> wrote:

> Sure, you may separate the logic as you want it to be, but just ensure
> the configuration object has a proper setJar or setJarByClass done on
> it before you submit the job.
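>
> For instance, a minimal sketch of that separation (the class and helper
> names here are illustrative, not from the thread):
>
>     import org.apache.hadoop.mapred.JobClient;
>     import org.apache.hadoop.mapred.JobConf;
>
>     public class MyDriver {
>       public static void main(String[] args) throws Exception {
>         prepareInputs();  // hypothetical non-Hadoop pre-processing
>         JobConf conf = new JobConf();
>         conf.setJarByClass(MyDriver.class);  // makes conf.getJar() valid
>         JobClient.runJob(conf);  // only this step touches the cluster
>       }
>       private static void prepareInputs() { /* plain Java work */ }
>     }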
>
> On Mon, Aug 13, 2012 at 4:43 PM, Manoj Babu <[EMAIL PROTECTED]> wrote:
> > Hi Harsh,
> >
> > Thanks for your reply.
> >
> > Consider that in my main program I am doing many
> > activities (reading/writing/updating non-Hadoop activities) before
> > invoking JobClient.runJob(conf);
> > Is there any way to separate the process flow programmatically,
> > instead of going for a workflow engine?
> >
> > Cheers!
> > Manoj.
> >
> >
> >
> > On Mon, Aug 13, 2012 at 4:10 PM, Harsh J <[EMAIL PROTECTED]> wrote:
> >>
> >> Hi Manoj,
> >>
> >> Reply inline.
> >>
> >> On Mon, Aug 13, 2012 at 3:42 PM, Manoj Babu <[EMAIL PROTECTED]> wrote:
> >> > Hi All,
> >> >
> >> > The normal Hadoop job submission process involves:
> >> >
> >> > 1. Checking the input and output specifications of the job.
> >> > 2. Computing the InputSplits for the job.
> >> > 3. Setting up the requisite accounting information for the
> >> >    DistributedCache of the job, if necessary.
> >> > 4. Copying the job's jar and configuration to the map-reduce system
> >> >    directory on the distributed file-system.
> >> > 5. Submitting the job to the JobTracker and optionally monitoring
> >> >    its status.
> >> >
> >> > I have a doubt about the 4th point of the job execution flow;
> >> > could any of you explain it?
> >> >
> >> > What is the job's jar?
> >>
> >> The job.jar is the jar you supply via "hadoop jar <jar>". Technically
> >> though, it is the jar pointed to by JobConf.getJar() (set via setJar
> >> or setJarByClass calls).
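> >>
> >> For illustration (the jar and class names are placeholders):
> >>
> >>     JobConf conf = new JobConf();
> >>     // Either point at the jar file directly...
> >>     conf.setJar("myjob.jar");
> >>     // ...or let Hadoop locate the jar containing a given class:
> >>     conf.setJarByClass(MyDriver.class);
> >>     String jar = conf.getJar();  // the path the framework will ship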
> >>
> >> > Is the job's jar the one we submitted to Hadoop, or will Hadoop
> >> > build it based on the job configuration object?
> >>
> >> It is the former, as explained above.
> >>
> >> --
> >> Harsh J
> >
> >
>
>
>
> --
> Harsh J
>