Re: execute hadoop job from remote web application
Hi Oleg
          I haven't tried out a scenario like the one you mentioned, but I
think there shouldn't be any issue in submitting a job that has some
dependent classes holding the business logic referred to from the mapper,
reducer or combiner. You should be able to do the job submission remotely
the same way we were discussing in this thread. If you need to distribute
any dependent jars/files along with the application jar, you can use the
-libjars option on the CLI or use the DistributedCache methods like
addArchiveToClassPath()/addFileToClassPath() in your Java code. If it is a
dependent jar, it is better to deploy it in the cluster environment itself,
so that you don't have to transfer the jar over the network every time you
submit your job.
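          As a rough, untested sketch (the jar names and HDFS paths below
are made up), the two routes would look something like this:

        // CLI route: -libjars is picked up by GenericOptionsParser, so the
        // driver has to run through ToolRunner for this to take effect:
        //   hadoop jar my_hadoop_job.jar -libjars /local/dependency.jar \
        //       -inputPath /opt/inputs/ -outputPath /data/output_jobs/output

        // Programmatic route from the driver code; the jar/archive must
        // already be in HDFS:
        import java.io.IOException;

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.filecache.DistributedCache;
        import org.apache.hadoop.fs.Path;

        public class DependencySetup {
            public static void addDeps(Configuration conf) throws IOException {
                // ships the jar to every task's classpath via the cache
                DistributedCache.addFileToClassPath(
                        new Path("/libs/dependency.jar"), conf);
                // same idea for an archive holding several jars
                DistributedCache.addArchiveToClassPath(
                        new Path("/libs/dependencies.zip"), conf);
            }
        }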
         Just a suggestion: if you can execute the job from within your
hadoop cluster, you don't have to do a remote job submission at all. You
just need to remotely invoke the shell script that contains the hadoop jar
command with any required input arguments. Sorry if I'm not getting your
requirement exactly.
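         For example, just as a rough, untested sketch (the host, user and
script name here are made up), the web application could trigger the
script over ssh like this:

        import java.io.BufferedReader;
        import java.io.IOException;
        import java.io.InputStreamReader;

        public class RemoteJobTrigger {
            // Assumes password-less ssh from the web app host to the master
            // and a wrapper script around the "hadoop jar" command line.
            public static int runJob() throws IOException, InterruptedException {
                ProcessBuilder pb = new ProcessBuilder(
                        "ssh", "hadoopuser@hadoop-master",
                        "/opt/hadoop/hadoop-jobs/run_job.sh",
                        "-inputPath", "/opt/inputs/",
                        "-outputPath", "/data/output_jobs/output");
                pb.redirectErrorStream(true); // merge stderr into stdout
                Process p = pb.start();
                BufferedReader out = new BufferedReader(
                        new InputStreamReader(p.getInputStream()));
                String line;
                while ((line = out.readLine()) != null) {
                    System.out.println(line); // relay the job's console output
                }
                return p.waitFor(); // non-zero usually means the job failed
            }
        }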

Regards
Bejoy.K.S

On Tue, Oct 18, 2011 at 6:29 PM, Oleg Ruchovets <[EMAIL PROTECTED]> wrote:

> Thank you all for your answers, but I still have a few questions:
>  Currently we run our jobs using shell scripts located on the hadoop
> master machine.
>
> Here is an example of command line:
> /opt/hadoop/bin/hadoop jar /opt/hadoop/hadoop-jobs/my_hadoop_job.jar
> -inputPath /opt/inputs/  -outputPath /data/output_jobs/output
>
> my_hadoop_job.jar has a class which parses input parameters and submits a
> job.
> Our code is very similar to what you wrote:
>   ......
>
>        job.setJarByClass(HadoopJobExecutor.class);
>        job.setMapperClass(MultipleOutputMap.class);
>        job.setCombinerClass(BaseCombine.class);
>        job.setReducerClass(HBaseReducer.class);
>        job.setOutputKeyClass(Text.class);
>        job.setOutputValueClass(MapWritable.class);
>
>        FileOutputFormat.setOutputPath(job, new Path(finalOutPutPath));
>
>        jobCompleteStatus = job.waitForCompletion(true);
> ...............
>
> my questions are:
>
> 1) my_hadoop_job.jar contains other classes (business logic), not only the
> Map, Combine and Reduce classes, and I still don't understand how I can
> submit a job which needs all the classes from my_hadoop_job.jar?
> 2) Do I need to submit my_hadoop_job.jar too? If yes, what is the way to
> do it?
>
> Thanks In Advance
> Oleg.
>
> On Tue, Oct 18, 2011 at 2:11 PM, Uma Maheswara Rao G 72686 <
> [EMAIL PROTECTED]> wrote:
>
> > ----- Original Message -----
> > From: Bejoy KS <[EMAIL PROTECTED]>
> > Date: Tuesday, October 18, 2011 5:25 pm
> > Subject: Re: execute hadoop job from remote web application
> > To: [EMAIL PROTECTED]
> >
> > > Oleg
> > >      If you are looking at how to submit your jobs using JobClient,
> > > then the sample below can give you a start.
> > >
> > > // get the configuration parameters and assign a job name
> > >        JobConf conf = new JobConf(getConf(), MyClass.class);
> > >        conf.setJobName("SMS Reports");
> > >
> > >        //setting key value types for mapper and reducer outputs
> > >        conf.setOutputKeyClass(Text.class);
> > >        conf.setOutputValueClass(Text.class);
> > >
> > >        //specifying the custom reducer class
> > >        conf.setReducerClass(SmsReducer.class);
> > >
> > >        //Specifying the input directories (at runtime) and Mappers
> > > independently for inputs from multiple sources
> > >        FileInputFormat.addInputPath(conf, new Path(args[0]));
> > >
> > >        //Specifying the output directory @ runtime
> > >        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
> > >
> > >        JobClient.runJob(conf);
> > >
> > > Along with the hadoop jars, you may need to have the config files on
> > > your client as well.
> > >
> > > The sample is from the old map reduce API. You can use the new one as
> > > well; in that case we use Job instead of JobClient.
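> > >
> > > A minimal sketch with the new API (org.apache.hadoop.mapreduce) might
> > > look like the following; the class names are just placeholders, and
> > > the client still needs the cluster config files (core-site.xml,
> > > mapred-site.xml) on its classpath:
> > >
> > >        Configuration conf = new Configuration();
> > >        Job job = new Job(conf, "SMS Reports");
> > >        job.setJarByClass(MyClass.class);
> > >        job.setReducerClass(SmsReducer.class);
> > >        job.setOutputKeyClass(Text.class);
> > >        job.setOutputValueClass(Text.class);
> > >        FileInputFormat.addInputPath(job, new Path(args[0]));
> > >        FileOutputFormat.setOutputPath(job, new Path(args[1]));
> > >        System.exit(job.waitForCompletion(true) ? 0 : 1);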