RE: Including third party jar files in Map Reduce job
As Bejoy mentioned,

If you copy the jar to $HADOOP_HOME/lib, you must copy it to that directory on every node in the cluster. (or)

If you want to use the -libjars option, your application should implement Tool so that it supports the generic options. Please check the link below for more details.

http://hadoop.apache.org/common/docs/current/commands_manual.html#jar
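
The Tool approach mentioned above can be sketched roughly as follows. This is a minimal outline, not the poster's actual driver: the class name WordCountDriver is hypothetical, the mapper and reducer setup is left as a comment, and Job.getInstance is assumed to be available (older releases may need new Job(conf, name) instead).

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical driver name; substitute your own main class.
public class WordCountDriver extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        // ToolRunner has already consumed generic options such as -libjars,
        // so args holds only the application arguments.
        if (args.length != 2) {
            System.err.println("Usage: WordCountDriver [generic options] <input> <output>");
            return 2;
        }

        // getConf() returns the Configuration that ToolRunner populated
        // from the generic options.
        // Assumption: Job.getInstance exists in your Hadoop release;
        // older releases may need new Job(getConf(), "wordcount").
        Job job = Job.getInstance(getConf(), "wordcount");
        job.setJarByClass(WordCountDriver.class);

        // Set the mapper, reducer and output key/value classes here,
        // e.g. job.setMapperClass(MyMapper.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner parses the generic options before calling run().
        System.exit(ToolRunner.run(new Configuration(), new WordCountDriver(), args));
    }
}
```

With a driver like this, the generic options go before the application arguments, for example (the jar path is a placeholder):

    hadoop jar WordCount.jar WordCountDriver -libjars /path/to/commons-math3.jar <input> <output>

If the jar's manifest already names the main class, the class name on the command line is omitted.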

Thanks
Devaraj
________________________________________
From: Bejoy Ks [[EMAIL PROTECTED]]
Sent: Wednesday, April 04, 2012 1:06 PM
To: [EMAIL PROTECTED]
Subject: Re: Including third party jar files in Map Reduce job

Hi Utkarsh
You can add third-party jars to your MapReduce job in any of the following ways:

1) Use -libjars
hadoop jar jarName.jar com.driver.ClassName -libjars /home/some/dir/somejar.jar ....

2) Include the third-party jars in the lib/ folder of your application jar when packaging it.

3) If you add the jar to $HADOOP_HOME/lib, you need to add it on all nodes.

Regards
Bejoy KS

On Wed, Apr 4, 2012 at 12:55 PM, Utkarsh Gupta <[EMAIL PROTECTED]> wrote:
Hi Devaraj,

I have already copied the required jar file into the $HADOOP_HOME/lib folder.
Can you tell me where to add the generic option -libjars?

The stack trace is:
hadoop$ bin/hadoop jar WordCount.jar /user/hduser1/input/ /user/hduser1/output
12/04/04 12:45:51 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/04/04 12:45:51 INFO input.FileInputFormat: Total input paths to process : 1
12/04/04 12:45:51 INFO mapred.JobClient: Running job: job_201204041107_0005
12/04/04 12:45:52 INFO mapred.JobClient:  map 0% reduce 0%
12/04/04 12:46:07 INFO mapred.JobClient: Task Id : attempt_201204041107_0005_m_000000_0, Status : FAILED
Error: java.lang.ClassNotFoundException: org.apache.commons.math3.random.RandomDataImpl
       at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
       at java.security.AccessController.doPrivileged(Native Method)
       at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
       at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
       at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
       at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
       at wordcount.MyMapper.map(MyMapper.java:22)
       at wordcount.MyMapper.map(MyMapper.java:14)
       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
       at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
       at java.security.AccessController.doPrivileged(Native Method)
       at javax.security.auth.Subject.doAs(Subject.java:396)
       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
       at org.apache.hadoop.mapred.Child.main(Child.java:253)

Thanks and Regards
Utkarsh

-----Original Message-----
From: Devaraj k [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, April 04, 2012 12:35 PM
To: [EMAIL PROTECTED]
Subject: RE: Including third party jar files in Map Reduce job

Hi Utkarsh,

The usage of the jar command is like this,

Usage: hadoop jar <jar> [mainClass] args...

If you want commons-math3.jar to be available to all the tasks, you can do either of these:
1. Copy the jar file into the $HADOOP_HOME/lib dir, or
2. Use the generic option -libjars.

Can you post the stack trace of your problem, so we can see which class is causing the ClassNotFoundException (i.e. the main class or the math lib class)?

Thanks
Devaraj
________________________________________
From: Utkarsh Gupta [[EMAIL PROTECTED]]
Sent: Wednesday, April 04, 2012 12:22 PM
To: [EMAIL PROTECTED]
Subject: Including third party jar files in Map Reduce job

Hi All,

I am new to Hadoop and was trying to generate random numbers using the Apache Commons Math library.
I used NetBeans to build the jar file, and the manifest has the path to the commons-math jar as lib/commons-math3.jar. I have placed this jar file in the HADOOP_HOME/lib folder, but I am still getting a ClassNotFoundException.
I tried using the -libjars option with
$HADOOP_HOME/bin/hadoop jar myprg.jar <inputpath> <outputpath> -libjars <jarpath>
and
$HADOOP_HOME/bin/hadoop jar myprg.jar -libjar <jarpath> <inputpath> <outputpath>
but this is not working. Please help.
Thanks and Regards
Utkarsh Gupta
