-
How to package multiple jars for a Hadoop job
Mark Kerzner 2011-02-18, 22:18
Hi,
I have a script that I use to re-package all the jars (which are output in a dist directory by NetBeans) - and it structures everything correctly into a single jar for running a MapReduce job. Here it is below, but I am not sure if it is the best practice. Besides, it hard-codes my paths. I am sure that there is a better way.
#!/bin/sh
# to be run from the project directory
cd ../dist
jar -xf MR.jar
jar -cmf META-INF/MANIFEST.MF /home/mark/MR.jar *
cd ../bin
echo "Repackaged for Hadoop"
Thank you, Mark
-
Re: How to package multiple jars for a Hadoop job
Eric Sammer 2011-02-18, 22:23
Mark:
You have a few options. You can:
1. Package dependent jars in a lib/ directory of the jar file.
2. Use something like Maven's assembly plugin to build a self-contained jar.
Either way, I'd strongly recommend using something like Maven to build your artifacts so they're reproducible and in line with commonly used tools. Hand packaging files tends to be error prone. This is less of a Hadoop-ism and more of a general Java development issue, though.
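A minimal sketch of option 1, assuming compiled classes sit in one directory and the dependency jars in another. The function names and all paths are hypothetical placeholders, not part of any tool. The idea is to stage the classes at the root of a directory tree, copy the dependency jars into its lib/ subdirectory, and then pack the tree with the JDK's jar tool.

```shell
#!/bin/sh
# Sketch only: stage a job jar layout with dependencies under lib/.
# Function names and paths are hypothetical placeholders.

# stage_job_jar CLASSES_DIR DEPS_DIR STAGE_DIR
# Copies compiled classes to the root of STAGE_DIR and every
# dependency jar found in DEPS_DIR into STAGE_DIR/lib.
stage_job_jar() {
    classes_dir=$1
    deps_dir=$2
    stage_dir=$3

    mkdir -p "$stage_dir/lib" || return 1
    # The trailing "/." copies the directory contents, not the directory itself
    cp -R "$classes_dir/." "$stage_dir/" || return 1
    for dep in "$deps_dir"/*.jar; do
        # Skip the unexpanded pattern when DEPS_DIR holds no jars
        [ -e "$dep" ] || continue
        cp "$dep" "$stage_dir/lib/" || return 1
    done
}

# build_job_jar STAGE_DIR OUTPUT_JAR
# Packs the staged tree into a jar; requires the JDK's jar tool on PATH.
# A default manifest is generated; use "jar cmf" to supply your own.
build_job_jar() {
    jar cf "$2" -C "$1" .
}
```

With a NetBeans-style layout this might be called as `stage_job_jar build/classes dist/lib /tmp/stage && build_job_jar /tmp/stage MR.jar`. Taking the directories as arguments avoids hard-coding paths in the script, though a Maven build remains the more reproducible route.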
--
Eric Sammer
twitter: esammer
data: www.cloudera.com
-
Re: How to package multiple jars for a Hadoop job
Mark Kerzner 2011-02-18, 22:26
Thank you, Mark
-
Re: How to package multiple jars for a Hadoop job
Jun Young Kim 2011-02-21, 02:22
Hi,
There is a Maven plugin for packaging Hadoop jobs. I think it is quite a convenient tool for this.
To use it, add the following to your pom.xml:
<plugin>
  <groupId>com.github.maven-hadoop.plugin</groupId>
  <artifactId>maven-hadoop-plugin</artifactId>
  <version>0.20.1</version>
  <configuration>
    <hadoopHome>your_hadoop_home_dir</hadoopHome>
  </configuration>
</plugin>
Junyoung Kim ([EMAIL PROTECTED])
-
Re: How to package multiple jars for a Hadoop job
Mark Kerzner 2011-02-21, 02:28
Thanks!
I am using simple NetBeans scripts which I am augmenting a little, but it seems I need to use Maven anyway.
Mark