Hadoop >> mail # user >> How to package multiple jars for a Hadoop job


Re: How to package multiple jars for a Hadoop job
Thanks!

I am using the simple NetBeans build scripts, which I am augmenting a little,
but it seems I need to use Maven anyway.

Mark

On Sun, Feb 20, 2011 at 8:22 PM, Jun Young Kim <[EMAIL PROTECTED]> wrote:

> hi,
>
> There is a Maven plugin for packaging Hadoop jobs.
> I think it is quite a convenient tool for this.
>
> If you are using Maven, add this to your pom.xml:
>
> <plugin>
>   <groupId>com.github.maven-hadoop.plugin</groupId>
>   <artifactId>maven-hadoop-plugin</artifactId>
>   <version>0.20.1</version>
>   <configuration>
>     <hadoopHome>your_hadoop_home_dir</hadoopHome>
>   </configuration>
> </plugin>
>
> Junyoung Kim ([EMAIL PROTECTED])
>
>
>
> On 02/19/2011 07:23 AM, Eric Sammer wrote:
>
>> Mark:
>>
>> You have a few options. You can:
>>
>> 1. Package dependent jars in a lib/ directory of the jar file.
>> 2. Use something like Maven's assembly plugin to build a self-contained jar.
>>
>> Either way, I'd strongly recommend using something like Maven to build
>> your artifacts so they're reproducible and in line with commonly used
>> tools. Hand packaging files tends to be error prone. This is less of a
>> Hadoop-ism and more of a general Java development issue, though.
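[For option 2 above, a minimal pom.xml fragment using the standard maven-assembly-plugin with its built-in jar-with-dependencies descriptor might look like the following; the plugin version is omitted and left to the reader:]

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-assembly-plugin</artifactId>
  <configuration>
    <descriptorRefs>
      <!-- built-in descriptor: unpacks all dependencies into one jar -->
      <descriptorRef>jar-with-dependencies</descriptorRef>
    </descriptorRefs>
  </configuration>
</plugin>
```

[Running `mvn assembly:single` after `mvn package` then produces a single self-contained jar suitable for `hadoop jar`.]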
>>
>> On Fri, Feb 18, 2011 at 5:18 PM, Mark Kerzner<[EMAIL PROTECTED]>
>>  wrote:
>>
>>> Hi,
>>>
>>> I have a script that I use to re-package all the jars (which are output
>>> in a dist directory by NetBeans) - and it structures everything
>>> correctly into a single jar for running a MapReduce job. Here it is
>>> below, but I am not sure if it is the best practice. Besides, it
>>> hard-codes my paths. I am sure that there is a better way.
>>>
>>> #!/bin/sh
>>> # to be run from the project directory
>>> cd ../dist
>>> jar -xf MR.jar
>>> jar -cmf META-INF/MANIFEST.MF  /home/mark/MR.jar *
>>> cd ../bin
>>> echo "Repackaged for Hadoop"
>>>
>>> Thank you,
>>> Mark
>>>
>>>
>>
>>
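[The hand-rolled script above can be made reusable by passing the paths in as arguments instead of hard-coding them. A minimal sketch, with the same steps as the original; the function and argument names are illustrative:]

```shell
# Sketch: re-package a NetBeans dist/ output into a single job jar,
# with the paths supplied as arguments (names illustrative).
repackage() {
    dist_dir=${1:?usage: repackage <dist-dir> <output-jar>}
    out_jar=${2:?usage: repackage <dist-dir> <output-jar>}
    (
        cd "$dist_dir" &&
        jar -xf MR.jar &&                           # unpack the NetBeans-built jar
        jar -cmf META-INF/MANIFEST.MF "$out_jar" *  # repack everything into one jar
    ) && echo "Repackaged for Hadoop: $out_jar"
}

# e.g.: repackage ../dist /home/mark/MR.jar
```

[This still does the manual re-packaging Eric warns against; it just removes the hard-coded paths until a Maven-based build is in place.]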