|
Shuja Rehman
2011-04-04, 15:06
Mark Kerzner
2011-04-04, 15:17
Mark Kerzner
2011-04-04, 15:18
Mark Kerzner
2011-04-04, 16:40
Allen Wittenauer
2011-04-04, 17:06
Marco Didonna
2011-04-04, 17:16
Mark Kerzner
2011-04-04, 17:20
Shuja Rehman
2011-04-04, 18:31
James Seigel
2011-04-04, 18:40
Bill Graham
2011-04-04, 20:59
Shuja Rehman
2011-04-06, 08:44
Bill Graham
2011-04-06, 15:29
Shuja Rehman
2011-04-06, 18:31
Bill Graham
2011-04-06, 20:17
Guy Doulberg
2011-04-07, 07:26
|
-
Including Additional Jars
Shuja Rehman 2011-04-04, 15:06

Hi all,

I have created a MapReduce job, and to run it on the cluster I have bundled all the jars (Hadoop, HBase, etc.) into a single jar, which makes the file much larger. During development I have to copy this complete file again and again, which is very time-consuming. Is there a way to copy only the program jar, without copying the library files every time? I am using NetBeans to develop the program.

Kindly let me know how to solve this issue.

Thanks

--
Regards
Shuja-ur-Rehman Baig
<http://pk.linkedin.com/in/shujamughal>
-
Re: Including Additional Jars
Mark Kerzner 2011-04-04, 15:17

Shuja,

Here is what I do in the NetBeans environment:

    #!/bin/sh
    cd ../dist
    # unpack the NetBeans-built jar into dist/
    jar -xf Chapter1.jar
    # repackage everything in dist/ (including lib/) with the original manifest
    jar -cmf META-INF/MANIFEST.MF ../Chapter3-for-Hadoop.jar *
    cd ../bin
    echo "Repackaged for Hadoop"

It does the job. I run it only when I want to build this jar.

Mark
-
Re: Including Additional Jars
Mark Kerzner 2011-04-04, 15:18

That was for my book (chapter 1 attached; you may find other things useful), but you would substitute your own project name.

Mark
-
Re: Including Additional Jars
Mark Kerzner 2011-04-04, 16:40

Then it seems you want to do the opposite of what I have done in this script. I am combining all the jars into one jar, and you already have that. Rather, you want to distribute only your application jar and put the other jars in the lib folder on the server.

I know that when you run a standard MR job, you only need to mention your jar, and the other Hadoop jars come from the lib folder. In other words, you should be able to run it like this:

    hadoop jar your-jar parameters

Since you are using the Cloudera distribution, this runs

    /usr/bin/hadoop-0.20

which in turn runs this script:

    #!/bin/sh
    export HADOOP_HOME=/usr/lib/hadoop-0.20
    exec /usr/lib/hadoop-0.20/bin/hadoop "$@"

Since HADOOP_HOME is set, it knows that the libraries are in

    /usr/lib/hadoop-0.20/lib/

Therefore, I think that if you put your additional libraries in the same folder, it should just pick them up.

Sincerely,
Mark

On Mon, Apr 4, 2011 at 11:31 AM, Shuja Rehman <[EMAIL PROTECTED]> wrote:
> Hi,
> I do not understand it. Can you explain it with my example?
>
> I have the following jars in the lib folder of dist created by NetBeans (dist/lib/):
>
> commons-logging-1.1.1.jar
> guava-r07.jar
> hadoop-0.20.2+737-core.jar
> hbase.jar
> hbase-0.89.20100924+28.jar
> log4j-1.2.15.jar
> mysql-connector-java-5.1.7-bin.jar
> UIDataTransporter.jar
> zookeeper.jar
>
> and the dist folder contains only
>
> MyProgram.jar
>
> At the moment I am combining all the jar files to produce a single file, but now I want to put the dist/lib/*.jar files on the server once, so that only MyProgram.jar has to be copied every time I change the code.
>
> So can you adapt your code to my example?
> Thanks
-
Re: Including Additional Jars
Allen Wittenauer 2011-04-04, 17:06

This was in the FAQ, but in a non-obvious place. I've updated it to be more visible (hopefully):

http://wiki.apache.org/hadoop/FAQ#How_do_I_submit_extra_content_.28jars.2C_static_files.2C_etc.29_for_my_job_to_use_during_runtime.3F
-
Re: Including Additional Jars
Marco Didonna 2011-04-04, 17:16

Does the same apply to jars containing libraries? Let's suppose I need lucene-core.jar to run my project. Can I put this jar into my job jar and have Hadoop "see" Lucene's classes? Or should I use the distributed cache?

MD
-
Re: Including Additional Jars
Mark Kerzner 2011-04-04, 17:20

I think you can put them either in your jar or in the distributed cache.

As Allen pointed out, my idea of putting them into the Hadoop lib folder was wrong.

Mark
-
Re: Including Additional Jars
Shuja Rehman 2011-04-04, 18:31

Well, I think putting them in the distributed cache is a good idea. Do you have a working example of how to put extra jars in the distributed cache and make them available to the job?

Thanks
-
Re: Including Additional Jars
James Seigel 2011-04-04, 18:40

James’ quick and dirty, get-your-job-running guideline:

-libjars <-- for jars you want accessible by the mappers and reducers
classpath or bundled in the main jar <-- for jars you want accessible to the runner

Cheers
James.
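A rough sketch of the -libjars half of that guideline, assuming the dependency jars have already been copied once to a directory on the machine that submits the job. The `-libjars` option takes a comma-separated list of jar paths; the function name `libjars_list`, the class name `com.example.MyJob`, and the `in`/`out` arguments are made up for illustration, and the final command is left as a comment so the sketch runs without a cluster.

```shell
#!/bin/sh
# Join every .jar in a directory into the comma-separated list that
# -libjars expects. Paths containing spaces or commas are not handled.
libjars_list() {
    dir=$1
    list=""
    for jar in "$dir"/*.jar; do
        # Skip the unexpanded pattern when the directory holds no jars.
        [ -e "$jar" ] || continue
        list="${list:+$list,}$jar"
    done
    printf '%s\n' "$list"
}

# Hypothetical usage (class name and arguments are assumptions):
# hadoop jar MyProgram.jar com.example.MyJob -libjars "$(libjars_list dist/lib)" in out
```

With this layout only MyProgram.jar changes between builds; the contents of dist/lib stay where they were copied.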
-
Re: Including Additional Jars
Bill Graham 2011-04-04, 20:59

Shuja, I haven't tried this, but from what I've read it seems you could just add all the jars required by the Mapper and Reducer to HDFS and then add them to the classpath in your run() method like this:

    DistributedCache.addFileToClassPath(new Path("/myapp/mylib.jar"), job);

I think that's all there is to it, but like I said, I haven't tried it. Just be sure your run() method isn't in the same class as your mapper/reducer if they import packages from any of the distributed cache jars.
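The "add the jars to HDFS" step could look roughly like the sketch below. It only prints the `hadoop fs` commands instead of running them, so it can be tried without a cluster; the function name `stage_jars_cmds` is made up, and `/myapp` is simply the HDFS directory from the example path above.

```shell
#!/bin/sh
# Print (do not run) the commands that would stage local jars in HDFS.
stage_jars_cmds() {
    local_dir=$1
    hdfs_dir=$2
    printf 'hadoop fs -mkdir %s\n' "$hdfs_dir"
    for jar in "$local_dir"/*.jar; do
        # Skip the unexpanded pattern when the directory holds no jars.
        [ -e "$jar" ] || continue
        printf 'hadoop fs -put %s %s/\n' "$jar" "$hdfs_dir"
    done
}

# Hypothetical usage: review the output, then pipe it to sh to execute it.
# stage_jars_cmds dist/lib /myapp
```

The jars only need to be staged once; each one still has to be added to the job's classpath in run() as described above.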
-
Re: Including Additional Jars
Shuja Rehman 2011-04-06, 08:44

Neither -libjars nor the distributed cache is working. Is there any other solution?
-
Re: Including Additional Jars
Bill Graham 2011-04-06, 15:29

If you could share more specifics regarding just how it's not working (i.e., job specifics, stack traces, how you're invoking it, etc.), you might get more assistance in troubleshooting.
-
Re: Including Additional Jars
Shuja Rehman 2011-04-06, 18:31

I am using the following command:

    hadoop jar myjar.jar -libjars /home/shuja/lib/mylib.jar param1 param2 param3

but the program still gives the error and does not find mylib.jar. Can you confirm the syntax of the command?

Thanks
-
Re: Including Additional Jars
Bill Graham 2011-04-06, 20:17

You need to pass the mainClass after the jar:

http://hadoop.apache.org/common/docs/r0.21.0/commands_manual.html#jar
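As a sketch of the argument order being described: jar first, then the main class, then generic options such as -libjars, then the job's own parameters. The class name `com.example.MyJob` and the helper `hadoop_jar_cmd` are made up for illustration, and the function only prints the command. Note also that -libjars is a generic option, so it is only honored when the main class runs through ToolRunner or otherwise uses GenericOptionsParser; the thread does not say whether that is the case here.

```shell
#!/bin/sh
# Assemble: hadoop jar <jar> <mainClass> -libjars <list> <args...>
hadoop_jar_cmd() {
    jar=$1
    main_class=$2
    libjars=$3
    shift 3
    printf 'hadoop jar %s %s -libjars %s' "$jar" "$main_class" "$libjars"
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

# Hypothetical usage (com.example.MyJob is an assumed class name):
# hadoop_jar_cmd myjar.jar com.example.MyJob /home/shuja/lib/mylib.jar param1 param2 param3
```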
-
RE: Including Additional Jars
Guy Doulberg 2011-04-07, 07:26

Or set the Main class in the manifest of the jar.
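One way to set the main class in the manifest, sketched with made-up names (`write_manifest`, `com.example.MyJob`, `manifest.txt`, `build/classes`). The function writes a minimal manifest file; the `jar cfm` step is left as a comment because it needs compiled classes. When the jar's manifest names a Main-Class, `hadoop jar` uses it and treats everything after the jar as arguments to that class.

```shell
#!/bin/sh
# Write a minimal manifest that names the job's main class.
write_manifest() {
    main_class=$1
    out=$2
    printf 'Manifest-Version: 1.0\nMain-Class: %s\n' "$main_class" > "$out"
}

# Hypothetical usage (all names are assumptions):
# write_manifest com.example.MyJob manifest.txt
# jar cfm myjar.jar manifest.txt -C build/classes .
# hadoop jar myjar.jar -libjars /home/shuja/lib/mylib.jar param1 param2 param3
```

NetBeans can also write this entry itself when a main class is configured for the project, which may be simpler than a hand-written manifest.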
>> >> >> >> >> >> As Allen pointed out, my idea of putting them into hadoop lib jar >> >> >> was >> >> >> wrong. >> >> >> >> >> >> Mark >> >> >> >> >> >> On Mon, Apr 4, 2011 at 12:16 PM, Marco Didonna >> >> >> <[EMAIL PROTECTED] >> >> >>> wrote: >> >> >> >> >> >>> On 04/04/2011 07:06 PM, Allen Wittenauer wrote: >> >> >>> >> >> >>>> >> >> >>>> On Apr 4, 2011, at 8:06 AM, Shuja Rehman wrote: >> >> >>>> >> >> >>>> Hi All >> >> >>>>> >> >> >>>>> I have created a map reduce job and to run on it on the cluster, >> >> >>>>> i >> >> have >> >> >>>>> bundled all jars(hadoop, hbase etc) into single jar which >> >> >>>>> increases >> >> the >> >> >>>>> size >> >> >>>>> of overall file. During the development process, i need to copy >> >> >>>>> again >> >> >> and >> >> >>>>> again this complete file which is very time consuming so is there >> >> >>>>> any >> >> >> way >> >> >>>>> that i just copy the program jar only and do not need to copy the >> >> >>>>> lib >> >> >>>>> files >> >> >>>>> again and again. i am using net beans to develop the program. >> >> >>>>> >> >> >>>>> kindly let me know how to solve this issue? >> >> >>>>> >> >> >>>> >> >> >>>> This was in the FAQ, but in a non-obvious place. I've >> >> >>>> updated >> >> it >> >> >>>> to be more visible (hopefully): >> >> >>>> >> >> >>>> >> >> >>>> >> >> >> >> >> >> >> http://wiki.apache.org/hadoop/FAQ#How_do_I_submit_extra_content_.28jars.2C_static_files.2C_etc.29_for_my_job_to_use_during_runtime.3F >> >> >>>> >> >> >>> >> >> >>> Does the same apply to jar containing libraries? Let's suppose I >> >> >>> need >> >> >>> lucene-core.jar to run my project. Can I put my this jar into my >> >> >>> job >> >> jar >> >> >> and >> >> >>> have hadoop "see" lucene's classes? Or should I use distributed >> >> >>> cache?? >> >> >>> >> >> >>> MD >> >> >>> >> >> >>> >> >> >> >> >> > >> >> > >> >> > >> >> > -- >> >> > Regards >> >> > Shuja-ur-Rehman Baig >> >> > <http://pk.linkedin.com/in/shujamughal> |