Re: doubt on Hadoop job submission process
Then I need to submit the jar containing the non-Hadoop activity classes and
its supporting libraries to all the nodes, since I can't create two jars.
Is there any way to do this more efficiently?
Cheers!
Manoj.
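
One common way to ship supporting libraries without building a second jar is
the generic -libjars option, which is honored when the driver runs through
ToolRunner/GenericOptionsParser. A minimal sketch (the driver class and jar
names here are hypothetical):

    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    public class MyDriver extends Configured implements Tool {
      public int run(String[] args) throws Exception {
        // getConf() already reflects generic options such as -libjars.
        JobConf conf = new JobConf(getConf(), MyDriver.class);
        // ... set mapper/reducer and input/output paths here ...
        JobClient.runJob(conf);
        return 0;
      }

      public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new MyDriver(), args));
      }
    }

Invocation would then look like:

    hadoop jar myjob.jar MyDriver -libjars dep1.jar,dep2.jar input output

Alternatively, dependency jars placed under a lib/ directory inside the job
jar itself are added to the task classpath automatically.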

On Mon, Aug 13, 2012 at 5:20 PM, Harsh J <[EMAIL PROTECTED]> wrote:

> Sure, you may separate the logic as you want it to be, but just ensure
> the configuration object has a proper setJar or setJarByClass done on
> it before you submit the job.
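
A minimal illustration of what Harsh describes (MyMapper is a hypothetical
class living in the job jar):

    JobConf conf = new JobConf();
    conf.setJarByClass(MyMapper.class);   // infer the jar containing MyMapper
    // ...or point at the jar explicitly:
    // conf.setJar("/path/to/myjob.jar");
    JobClient.runJob(conf);               // the recorded jar is shipped at submit time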
>
> On Mon, Aug 13, 2012 at 4:43 PM, Manoj Babu <[EMAIL PROTECTED]> wrote:
> > Hi Harsh,
> >
> > Thanks for your reply.
> >
> > Consider that in my main program I am doing many
> > activities (reading/writing/updating, i.e. non-Hadoop activities) before
> > invoking JobClient.runJob(conf);
> > Is there any way to separate the process flow programmatically instead
> > of going for a workflow engine?
> >
> > Cheers!
> > Manoj.
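
A sketch of the separation being asked about, assuming hypothetical helper
methods: the non-Hadoop work runs as plain Java around the submission call,
so no workflow engine is required.

    public static void main(String[] args) throws Exception {
      readAndUpdateLocalData();                    // non-Hadoop activities first

      JobConf conf = new JobConf(MyDriver.class);  // records the jar to ship
      // ... set mapper/reducer and input/output paths here ...
      JobClient.runJob(conf);                      // blocks until the job completes

      postProcessResults();                        // more non-Hadoop work afterwards
    }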
> >
> >
> >
> > On Mon, Aug 13, 2012 at 4:10 PM, Harsh J <[EMAIL PROTECTED]> wrote:
> >>
> >> Hi Manoj,
> >>
> >> Reply inline.
> >>
> >> On Mon, Aug 13, 2012 at 3:42 PM, Manoj Babu <[EMAIL PROTECTED]> wrote:
> >> > Hi All,
> >> >
> >> > The normal Hadoop job submission process involves:
> >> >
> >> > 1. Checking the input and output specifications of the job.
> >> > 2. Computing the InputSplits for the job.
> >> > 3. Setting up the requisite accounting information for the
> >> >    DistributedCache of the job, if necessary.
> >> > 4. Copying the job's jar and configuration to the map-reduce system
> >> >    directory on the distributed file system.
> >> > 5. Submitting the job to the JobTracker and optionally monitoring its
> >> >    status.
> >> >
> >> > I have a doubt about the 4th point of this job execution flow; could
> >> > any of you explain it?
> >> >
> >> > What is the job's jar?
> >>
> >> The job.jar is the jar you supply via "hadoop jar <jar>". Technically
> >> though, it is the jar pointed to by JobConf.getJar() (set via setJar or
> >> setJarByClass calls).
> >>
> >> > Is the job's jar the one we submitted to Hadoop, or will Hadoop build
> >> > it based on the job configuration object?
> >>
> >> It is the former, as explained above.
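
In other words (a small hedged illustration; MyMapper is hypothetical), the
jar recorded on the JobConf is exactly what gets copied to the map-reduce
system directory in step 4:

    JobConf conf = new JobConf();
    conf.setJarByClass(MyMapper.class);
    // getJar() now returns the path of the jar containing MyMapper;
    // this is the "job's jar" that submission copies to the system directory.
    String jobJar = conf.getJar();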
> >>
> >> --
> >> Harsh J
> >
> >
>
>
>
> --
> Harsh J
>