-
Setting jar for embedded Job (Hadoop 0.20.2)
Cyril Briquet 2010-07-27, 02:05
Hi,
I'd like to run a Hadoop (0.20.2) job from within another application, using ToolRunner.
One class of this other application implements the Tool interface. The implemented run() method:
* constructs a Job()
* sets the input/output/mapper/reducer
* sets the jar file by calling job.setJarByClass()
* calls job.waitForCompletion()
The question is: where should the jar file be made available? In the current local directory of the parent application? In the system directory in HDFS? ...?
I'd like to find documentation and learn how this works.
Thank you,
Cyril
-
Re: Setting jar for embedded Job (Hadoop 0.20.2)
Hemanth Yamijala 2010-07-27, 04:32
Hi,
> I'd like to run a Hadoop (0.20.2) job
> from within another application, using ToolRunner.
>
> One class of this other application implements the Tool interface.
> The implemented run() method:
> * constructs a Job()
> * sets the input/output/mapper/reducer
> * sets the jar file by calling job.setJarByClass().
> * calls job.waitForCompletion()
>
> The question is: where should the jar file be made available?
> In the current local directory of the parent application? In the system
> directory in HDFS? ...?
>
> I'd like to find documentation and learn how this works.
>
If you are planning to use job.setJarByClass, the files only need to be on your classpath locally, where you are running the application. You could look at o.a.h.mapred.JobConf.findContainingJar, which is passed the class name you set in setJarByClass, to see how the jar file is located.
Thanks
Hemanth
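To illustrate the kind of lookup described in the reply above, here is a minimal JDK-only sketch. It is not copied from the Hadoop source: the class name JarLocator and the helper jarPathFromUrlPath are hypothetical names chosen for this example, and the assumption is that the lookup amounts to asking the class loader for the class's .class resource and keeping the first "jar:" URL it returns.

```java
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.net.URL;
import java.net.URLDecoder;
import java.util.Enumeration;

// Hypothetical helper, for illustration only (not Hadoop's implementation).
class JarLocator {

    // Turns the path part of a "jar:" URL, e.g. "file:/tmp/job.jar!/x/Y.class",
    // into a plain file system path such as "/tmp/job.jar".
    public static String jarPathFromUrlPath(String urlPath) {
        String path = urlPath;
        if (path.startsWith("file:")) {
            path = path.substring("file:".length());
        }
        int bang = path.indexOf('!');
        if (bang >= 0) {
            path = path.substring(0, bang); // drop the entry name inside the jar
        }
        try {
            return URLDecoder.decode(path, "UTF-8"); // e.g. %20 becomes a space
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Returns the path of the jar that contains the given class on the local
    // classpath, or null if the class was not loaded from a jar.
    public static String findContainingJar(Class<?> clazz) {
        ClassLoader loader = clazz.getClassLoader();
        if (loader == null) {
            // Classes loaded by the bootstrap loader report a null class loader.
            loader = ClassLoader.getSystemClassLoader();
        }
        String classFile = clazz.getName().replace('.', '/') + ".class";
        try {
            Enumeration<URL> urls = loader.getResources(classFile);
            while (urls.hasMoreElements()) {
                URL url = urls.nextElement();
                if ("jar".equals(url.getProtocol())) {
                    return jarPathFromUrlPath(url.getPath());
                }
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return null;
    }

    public static void main(String[] args) {
        // Prints the jar holding this class, or "null" when run from a directory.
        System.out.println(findContainingJar(JarLocator.class));
    }
}
```

Under this assumption, a class that is only available as a loose .class file in a directory yields null, which would correspond to the situation where no job jar gets set.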
-
Re: Setting jar for embedded Job (Hadoop 0.20.2)
Cyril Briquet 2010-07-27, 19:21
Hi,
Thanks, it works!
I just tried copying the .jar file containing the mapper and reducer classes to the current directory, from which I run the application that launches the Hadoop job, and it works.
Have a great day,
Cyril
N.B.: for the record, the stack trace from before putting the .jar in the current directory is copy/pasted at the bottom of this e-mail.
On Tue, Jul 27, 2010 at 12:32 AM, Hemanth Yamijala <[EMAIL PROTECTED]> wrote:
> Hi,
>
> > I'd like to run a Hadoop (0.20.2) job
> > from within another application, using ToolRunner.
> >
> > One class of this other application implements the Tool interface.
> > The implemented run() method:
> > * constructs a Job()
> > * sets the input/output/mapper/reducer
> > * sets the jar file by calling job.setJarByClass().
> > * calls job.waitForCompletion()
> >
> > The question is: where should the jar file be made available?
> > In the current local directory of the parent application? In the system
> > directory in HDFS? ...?
> >
> > I'd like to find documentation and learn how this works.
> >
>
> If you are planning to use job.setJarByClass, the files only need to
> be only on your classpath locally where you are running the
> application. You could look at o.a.h.mapred.JobConf.findContainingJar
> which is passed the class name you set in setJarByClass to see how the
> jar file is located.
>
> Thanks
> Hemanth
>
10/07/27 14:56:11 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
10/07/27 14:56:11 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
10/07/27 14:56:11 INFO input.FileInputFormat: Total input paths to process : 1
10/07/27 14:56:11 INFO mapred.JobClient: Running job: job_201007261547_0007
10/07/27 14:56:12 INFO mapred.JobClient: map 0% reduce 0%
10/07/27 14:56:21 INFO mapred.JobClient: Task Id : attempt_201007261547_0007_m_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: x.y.z.my.class.name
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:809)
    at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:157)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:569)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.ClassNotFoundException: x.y.z.my.class.name
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:762)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:807)
    ... 4 more
10/07/27 14:56:27 INFO mapred.JobClient: Task Id : attempt_201007261547_0007_m_000000_1, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: x.y.z.my.class.name
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:809)
    at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:157)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:569)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.ClassNotFoundException: x.y.z.my.class.name
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:762)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:807)
    ... 4 more
10/07/27 14:56:34 INFO mapred.JobClient: Task Id : attempt_201007261547_0007_m_000000_2, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: x.y.z.my.class.name
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:809)
    at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:157)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:569)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.ClassNotFoundException: x.y.z.my.class.name
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:762)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:807)
    ... 4 more
10/07/27 14:56:43 INFO mapred.JobClient: Job complete: job_201007261547_0007
10/07/27 14:56:43 INFO mapred.JobClient: Counters: 3
10/07/27 14:56:43 INFO mapred.JobClient: Job Counte