Re: MiniCluster and "provided" scope dependencies
I don't think we should do that. Artifacts shouldn't be deployed
multiple times with different POMs for different dependencies. (I'm
100% positive we'd get a scolding from Benson for that.)

The point of MAC is to test Accumulo, not Hadoop, and the additional
classifiers add a lot of complexity to the build. I think some of
this could be improved via the accumulo-maven-plugin. You can
manipulate plugin dependencies easily enough in Maven right now, and
it would be trivial for users to override the a-m-p dependency on
hadoop-client. (http://blog.sonatype.com/people/2008/04/how-to-override-a-plugins-dependency-in-maven/)

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii
On Tue, Sep 24, 2013 at 1:20 PM, Josh Elser <[EMAIL PROTECTED]> wrote:
> Oh, I see your point now. For choosing hadoop 1 versus hadoop 2, we
> would just use the same profiles that we already have in place. We
> could look into using a classifier when deploying these artifacts so
> users can pull down a version of minicluster that is compatible with
> hadoop2 without forcing them to build it themselves.
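>
> To illustrate, a test project could then pull the hadoop2 build with
> something like this (the classifier name is hypothetical):
>
>     <dependency>
>       <groupId>org.apache.accumulo</groupId>
>       <artifactId>accumulo-minicluster</artifactId>
>       <version>1.5.0</version>
>       <classifier>hadoop2</classifier>
>     </dependency>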
>
> Given that we already *have* hadoop-1.x listed as the default
> dependency, I don't really see that as being an issue.
>
> On Tue, Sep 24, 2013 at 12:58 PM, Keith Turner <[EMAIL PROTECTED]> wrote:
>> On Tue, Sep 24, 2013 at 12:48 PM, Josh Elser <[EMAIL PROTECTED]> wrote:
>>
>>> On Tue, Sep 24, 2013 at 12:31 PM, Keith Turner <[EMAIL PROTECTED]> wrote:
>>> > On Tue, Sep 24, 2013 at 11:57 AM, Josh Elser <[EMAIL PROTECTED]> wrote:
>>> >
>>> >> I'm curious to hear what people think on this.
>>> >>
>>> >> I'm a really big fan of spinning up a minicluster instance to do some
>>> >> "more real" testing of software as I write it.
>>> >>
>>> >> With 1.5.0, it's a bit more painful because I have to add a bunch more
>>> >> dependencies to my project (which previously would only have to depend
>>> >> on the accumulo-minicluster artifact). The list includes, but is
>>> >> likely not limited to, commons-io, commons-configuration,
>>> >> hadoop-client, zookeeper, log4j, slf4j-api, slf4j-log4j12.
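>>> >>
>>> >> For example, a test project that previously declared only the
>>> >> minicluster artifact now needs something like this (the hadoop
>>> >> version shown is illustrative):
>>> >>
>>> >>     <dependency>
>>> >>       <groupId>org.apache.accumulo</groupId>
>>> >>       <artifactId>accumulo-minicluster</artifactId>
>>> >>       <version>1.5.0</version>
>>> >>     </dependency>
>>> >>     <dependency>
>>> >>       <groupId>org.apache.hadoop</groupId>
>>> >>       <artifactId>hadoop-client</artifactId>
>>> >>       <version>1.2.1</version>
>>> >>     </dependency>
>>> >>     <!-- ...plus commons-io, commons-configuration, zookeeper,
>>> >>          log4j, slf4j-api, slf4j-log4j12 -->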
>>> >>
>>> >> Best as I understand it, the intent of this was that Hadoop will
>>> >> typically provide these artifacts at runtime, and therefore Accumulo
>>> >> doesn't need to re-bundle them itself which I'd agree with (not
>>> >> getting into that whole issue about the Hadoop "ecosystem"). However,
>>> >> I would think that the minicluster should have non-provided scope
>>> >> dependencies declared on these, as there is no Hadoop installation --
>>> >>
>>> >
>>> > Would this require declaring dependencies on a particular version
>>> > of hadoop in the minicluster pom?  Or could the minicluster pom
>>> > have profiles for different hadoop versions?  I do not know enough
>>> > about maven to know if you can use profiles declared in a
>>> > dependency (e.g. if a user depends on minicluster, can they
>>> > activate profiles in it?)
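>>> >
>>> > i.e., something like this in the minicluster pom (a sketch; the
>>> > profile id and version are made up):
>>> >
>>> >     <profiles>
>>> >       <profile>
>>> >         <id>hadoop2</id>
>>> >         <dependencies>
>>> >           <dependency>
>>> >             <groupId>org.apache.hadoop</groupId>
>>> >             <artifactId>hadoop-client</artifactId>
>>> >             <version>2.2.0</version>
>>> >           </dependency>
>>> >         </dependencies>
>>> >       </profile>
>>> >     </profiles>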
>>>
>>> The actual dependency in minicluster is against Apache Hadoop but
>>> that's beside the point.
>>>
>>> By marking the hadoop-client dependency as provided, Hadoop's
>>> dependencies are *not* included at runtime (because hadoop is
>>> provided and, as such, so are its dependencies). In other words,
>>> this is entirely independent of what's actually included in a
>>> distribution of Hadoop when you download and install it.
>>>
>>> Apache Hadoop has dependencies we need to run minicluster. By marking
>>> the hadoop-client artifact as 'provided', we do not get its
>>> dependencies and the minicluster fails to run. I think this is easy
>>> enough to work around by overriding the dependencies we need to run
>>> the minicluster in the minicluster module (e.g. make the hadoop-client
>>> not 'provided' in the minicluster module). Thus, as we add more things
>>>
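>>> A sketch of that workaround in the minicluster module's pom (the
>>> scope is spelled out explicitly; assuming the version still comes
>>> from the parent's dependencyManagement):
>>>
>>>     <dependency>
>>>       <groupId>org.apache.hadoop</groupId>
>>>       <artifactId>hadoop-client</artifactId>
>>>       <!-- override the parent's 'provided' so hadoop's transitive
>>>            dependencies land on the runtime classpath -->
>>>       <scope>compile</scope>
>>>     </dependency>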
>>
>> So if we mark hadoop-client as not provided, then we have to choose a
>> version?  How easy will it be for a user to choose a different version of
>> hadoop for their testing?  I am trying to understand what impact this would