-
HBase .92 maven artifacts compiled against different releases of Hadoop
Roman Shaposhnik 2011-11-11, 02:04
While running tests against HBase 0.92/Hadoop 0.20.205 (yes, I'm about to report results soon ;-)) I've realized that the problem of Hadoop versions bleeding through the HBase artifacts is very real and one needs to have different maven artifacts of HBase 0.92 to run against different versions of Hadoop.

So here's the question for you guys -- how do you want to deal with that?

Thanks,
Roman.
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Stack 2011-11-11, 16:45
On Thu, Nov 10, 2011 at 6:04 PM, Roman Shaposhnik <[EMAIL PROTECTED]> wrote:
> While running tests against HBase 0.92/Hadoop 0.20.205 (yes, I'm about to
> report results soon ;-)) I've realized that the problem of Hadoop versions
> bleeding through the HBase artifacts is very real and one needs to have
> different maven artifacts of HBase 0.92 to run against different versions of
> Hadoop.
>
> So here's the question for you guys -- how do you want to deal with that?

We have maven profiles in our pom for different hadoop versions. This implies a build per hadoop version. Are you thinking of something else, Roman? Should we ship an hbase version per hadoop version we (claim to) support?

St.Ack
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Alejandro Abdelnur 2011-11-11, 16:56
Hi Stack,

Are all those builds exactly the same bits? If not we may need to have different hbase versions for each one of them as they would be different maven artifacts to be published.

If they are the same version we are good as developers would just have to override the version of Hadoop in their project to have precedence over the version defined in the hbase artifact (POM).

Thanks.

Alejandro
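A minimal sketch of the override Alejandro describes, assuming a downstream Maven project. The coordinates follow the ones used in this thread; the specific version numbers are illustrative assumptions, not a tested combination. Because Maven's dependency mediation prefers the declaration nearest to the project, a direct hadoop-core dependency takes precedence over the one pulled in transitively through hbase:

```xml
<!-- Fragment of a downstream project's pom.xml (illustrative versions). -->
<dependencies>
  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase</artifactId>
    <version>0.92.0</version>
  </dependency>
  <!-- Declared directly, so it wins over the hadoop-core version
       that the hbase POM would otherwise bring in transitively. -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <version>0.20.203.0</version>
  </dependency>
</dependencies>
```

This only changes which Hadoop jar ends up on the classpath. It does not help if the hbase bits themselves differ per Hadoop version, and it does not cover Hadoop releases that are split into differently named artifacts.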
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Roman Shaposhnik 2011-11-11, 16:58
On Fri, Nov 11, 2011 at 8:56 AM, Alejandro Abdelnur <[EMAIL PROTECTED]> wrote:
> Hi Stack,
>
> Are all those builds exactly the same bits?

No, they are not. That's exactly the problem I was alluding to.

Thanks,
Roman.
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Roman Shaposhnik 2011-11-11, 17:01
On Fri, Nov 11, 2011 at 8:45 AM, Stack <[EMAIL PROTECTED]> wrote:
> We have maven profiles in our pom for different hadoop versions.
> This implies a build per hadoop version.

Correct. What I'm thinking is that when you build using one of those profiles you get bits which are incompatible with a different version of Hadoop. As such we need to figure out a way of making these artifacts available independently or at least converge on a single version of the artifact. The question is -- how. I could see various technical approaches to this, but I wanted to ask the project first.

> Should we ship an hbase version per hadoop version we (claim
> to) support?

In fact, having different tarballs for different distros is also part of the question.

Thanks,
Roman.
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Stack 2011-11-11, 17:27
On Fri, Nov 11, 2011 at 8:58 AM, Roman Shaposhnik <[EMAIL PROTECTED]> wrote:
> On Fri, Nov 11, 2011 at 8:56 AM, Alejandro Abdelnur <[EMAIL PROTECTED]> wrote:
>> Hi Stack,
>>
>> Are all those builds exactly the same bits?
>
> No, they are not. That's exactly the problem I was alluding to.

They are not? We were trying to use reflection to figure out what our underpinnings are and then proceed accordingly. I can imagine we've missed spots but that's the overall intent; i.e. we'd like the same bits to work on 0.20.205.x, 0.22.x, and 0.23.x hadoops.

Releasing a tarball per hadoop (multiply the number of versions by two since we'll want to have secure and insecure hbase, I believe) makes me queasy.

St.Ack
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Stack 2011-11-11, 17:29
On Fri, Nov 11, 2011 at 9:01 AM, Roman Shaposhnik <[EMAIL PROTECTED]> wrote:
> On Fri, Nov 11, 2011 at 8:45 AM, Stack <[EMAIL PROTECTED]> wrote:
>> We have maven profiles in our pom for different hadoop versions.
>> This implies a build per hadoop version.
>
> Correct. What I'm thinking is that when you build using one of those
> profiles you get bits which are incompatible with a different version
> of Hadoop. As such we need to figure out a way of making these
> artifacts available independently or at least converge on a single
> version of the artifact.

Hmm... I thought it was just a packaging issue; what gets assembled as the product (I could be wrong -- I've not looked at this in a while. You'd know better, Roman).

St.Ack
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Todd Lipcon 2011-11-11, 18:32
I agree with Stack - it only changes around dependencies, not code modules (at this current point in time).

Perhaps when you install the artifacts into a repository, though, the dependencies leak into the dependency list of the installed POM - meaning that we'd want different POMs installed based on which dependency should get pulled in.

-Todd

--
Todd Lipcon
Software Engineer, Cloudera
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Eric Yang 2011-11-11, 18:45
In general, if HBase uses only the public API of Hadoop, HBase should be able to work with multiple versions of Hadoop. However, the reality is not so trivial. Some Hadoop APIs are not fully compatible between major versions of Hadoop. This was observed between HBase 0.90.x and Hadoop 0.20.203: the introduction of the Hadoop Metrics V2 framework and the removal of the Hadoop Metrics V1 framework caused Hadoop 0.20.200-203 to be incompatible with HBase. Some effort was put into restoring and forward porting features to ensure HBase 0.90.x and Hadoop 0.20.205.0 can work together. I recommend that one HBase release should be certified for one major release of Hadoop to reduce risk. Perhaps when the public Hadoop APIs are rock solid, it will become feasible to have a version of HBase that works across multiple versions of Hadoop.

In the proposed HBase structure layout change (HBASE-4337), the packaging process excludes the Hadoop jar file and picks it up from the constructed class path. This is part of the effort to ensure Hadoop-related technologies can work together in an integrated fashion (file system layout change in HADOOP-6255). This is the starting point to ensure that Hadoop can be swapped out with a different major version for testing. Once the proposed structure is adopted, the HBase community can set up integration tests for HBase with multiple Hadoop major releases.

regards,
Eric
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Todd Lipcon 2011-11-11, 18:54
On Fri, Nov 11, 2011 at 10:45 AM, Eric Yang <[EMAIL PROTECTED]> wrote:
> I recommend that one HBase release should be certified for one major
> release of Hadoop to reduce risk. Perhaps when the public Hadoop APIs are
> rock solid, it will become feasible to have a version of HBase that
> works across multiple versions of Hadoop.

IMO this is entirely untenable. Different users upgrade HDFS at different rates - soon a lot of people will start to use 0.23 whereas many people will be running 0.20 for years to come. New versions of HBase need to be able to run against both (perhaps with some features or improvements only available on the latest).

> In the proposed HBase structure layout change (HBASE-4337), the packaging
> process excludes the Hadoop jar file and picks it up from the constructed
> class path. This is part of the effort to ensure Hadoop-related
> technologies can work together in an integrated fashion (file system
> layout change in HADOOP-6255). This is the starting point to ensure that
> Hadoop can be swapped out with a different major version for testing.
> Once the proposed structure is adopted, the HBase community can set up
> integration tests for HBase with multiple Hadoop major releases.

I don't think the file layout is the major barrier here... the barrier is people actually helping to write integration tests, and working to fix bugs as they're found. For example, I would very much appreciate anyone's help to get this build green: https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-23/

-Todd

--
Todd Lipcon
Software Engineer, Cloudera
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Gary Helmling 2011-11-11, 19:04
> Some effort was put into restoring and forward porting features to ensure
> HBase 0.90.x and Hadoop 0.20.205.0 can work together. I recommend that
> one HBase release should be certified for one major release of Hadoop to
> reduce risk. Perhaps when the public Hadoop APIs are rock solid, it
> will become feasible to have a version of HBase that works across
> multiple versions of Hadoop.

Since 0.20.205.0 is the build default, a lot of the testing will naturally take place on this combination. But there are clearly others interested in (and investing a lot of testing effort in) running on 0.22 and 0.23, so we can't exclude those as unsupported.

> In the proposed HBase structure layout change (HBASE-4337), the packaging
> process excludes the Hadoop jar file and picks it up from the constructed
> class path. This is part of the effort to ensure Hadoop-related
> technologies can work together in an integrated fashion (file system
> layout change in HADOOP-6255).

This is good, when the packaging system supports flexible enough dependencies to allow different Hadoop versions to satisfy the package "Depends:", but I don't think it gets us all the way there.

We still want to provide tarball distributions that contain a bundled Hadoop jar for easy standalone setup and testing.

Maven dependencies seem to be the other limiting factor. If I set up a java program that uses the HBase client and declare that dependency, I get a transitive dependency on Hadoop (good), but what version? If I'm running Hadoop 0.22, but the published maven artifact for HBase depends on 205, can I override that dependency in my POM? Or do we need to publish separate maven artifacts for each Hadoop version, so that the dependencies for each possible combination can be met (using versioning or the version classifier)?

I really don't know enough about maven dependency management. Can we specify a version like (0.20.205.0|0.22|0.23)? Or is there any way for Hadoop to do a "Provides:" on a virtual package name that those 3 can share?

--gh
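For reference on the "(0.20.205.0|0.22|0.23)" question: Maven's version-range syntax does accept a list of exact versions, written as comma-separated bracketed sets. The fragment below is only a syntax sketch. It assumes, purely for illustration, that one artifactId is published at all three versions, which may not hold for Hadoop.

```xml
<!-- Syntax sketch only: accept exactly one of three versions.
     Whether hadoop-core exists at each of these versions is an
     assumption made for illustration. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-core</artifactId>
  <version>[0.20.205.0],[0.22.0],[0.23.0]</version>
</dependency>
```

A bare version such as 0.20.205.0 is a soft preference that a nearer declaration can override, while a bracketed version is a hard requirement. A range does not help when the alternatives are published under different artifactIds.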
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Roman Shaposhnik 2011-11-11, 19:07
On Fri, Nov 11, 2011 at 10:32 AM, Todd Lipcon <[EMAIL PROTECTED]> wrote:
> I agree with Stack - it only changes around dependencies, not code
> modules (at this current point in time).

Unfortunately it does bleed through into code as well. At least for the hbase test artifact, I can NOT run a single one against different versions of Hadoop. I'm less certain about the main artifact, but I believe I also had issues with MR2 there (although it could have been a cascading problem from a test artifact).

Anyway, the hbase test maven artifact is busted as far as multiple versions of MR are concerned -- that much I know for sure.

Personally, I'd say that we have to fix even that. Or at least have a resolution path. Maybe the answer there is to provide some kind of shim layer -- that would be fine. That's why I asked a question first without suggesting a solution.

> Perhaps when you install the artifacts into a repository, though, the
> dependencies leak into the dependency list of the installed POM -
> meaning that we'd want different POMs installed based on which
> dependency should get pulled in.

Yup. And that's a second major problem -- dependency leakage.

Thanks,
Roman.
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Todd Lipcon 2011-11-11, 19:14
On Fri, Nov 11, 2011 at 11:07 AM, Roman Shaposhnik <[EMAIL PROTECTED]> wrote:
> Unfortunately it does bleed through into code as well. At least for the hbase test
> artifact, I can NOT run a single one against different versions of Hadoop.
> I'm less certain about the main artifact, but I believe I also had issues
> with MR2 there (although it could have been a cascading problem from
> a test artifact).

Ah, it might be the case that one of the upstream APIs changed from interface to abstract class, which would make incompatible HBase jars as a result... but I thought we had papered over all of those with reflection. Maybe not in the tests.

>> Perhaps when you install the artifacts into a repository, though, the
>> dependencies leak into the dependency list of the installed POM -
>> meaning that we'd want different POMs installed based on which
>> dependency should get pulled in.
>
> Yup. And that's a second major problem -- dependency leakage.

Is this not a common issue with maven artifacts? How do people generally deal with it?

-Todd

--
Todd Lipcon
Software Engineer, Cloudera
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Alejandro Abdelnur 2011-11-11, 19:40
>> Yup. And that's a second major problem -- dependency leakage.
>
> Is this not a common issue with maven artifacts? How do people
> generally deal with it?

Yes, this is a common issue.

In Oozie we've dealt with this by excluding from the Hadoop dependencies all sorts of things that may come in with the different versions we use.

And things get worse because of incorrect dependency classification (e.g. junit defined as required for execution when it should be marked scope=test) and unnecessary dependencies (e.g. log4j bringing in JMS and Mail, which should be marked optional=true).

And finally, and I think this is the worst offender, Hadoop does not have a client artifact, so it brings to the client all sorts of JARs that are not needed when using the client APIs.

And to complicate things more, with 0.23 we are introducing several artifacts (hadoop-common, hadoop-hdfs, hadoop-mapreduce-client-*) that must be included by clients. These are different from the old hadoop-core. This forces downstream projects (in the case of Maven projects) to use profiles to include one or the other.

A solution would be that in 0.23+ we have an umbrella hadoop-core artifact that groups all the hadoop artifacts needed by clients, excluding the ones that are not needed.

If you think this is a good idea we should move this discussion to the Hadoop alias.

Thanks.

Alejandro
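The two sides of the leakage problem described above can be sketched in POM terms. This is a hedged illustration, not the Oozie or Hadoop POM: the excluded artifact and the version numbers are assumptions chosen only to show the mechanism. A consumer uses `<exclusions>` to cut unwanted transitive dependencies, and a publisher uses `<scope>test</scope>` and `<optional>true</optional>` so that such dependencies never leak in the first place.

```xml
<dependencies>
  <!-- Consumer side: depend on Hadoop but drop a transitive
       dependency that is not wanted (illustrative choice). -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <version>0.20.205.0</version>
    <exclusions>
      <exclusion>
        <groupId>org.mortbay.jetty</groupId>
        <artifactId>jetty</artifactId>
      </exclusion>
    </exclusions>
  </dependency>

  <!-- Publisher side: a test-only dependency is scoped so that it
       is not passed on to downstream projects. -->
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.10</version>
    <scope>test</scope>
  </dependency>

  <!-- Publisher side: an optional dependency is not pulled in
       transitively unless the consumer declares it. -->
  <dependency>
    <groupId>javax.mail</groupId>
    <artifactId>mail</artifactId>
    <version>1.4</version>
    <optional>true</optional>
  </dependency>
</dependencies>
```

Exclusions have to be repeated on every dependency that brings the unwanted artifact in, which is why the list grows with each Hadoop version a project supports.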
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Eric Yang 2011-11-11, 19:54
On Nov 11, 2011, at 11:04 AM, Gary Helmling wrote:
> Maven dependencies seem to be the other limiting factor. If I set up a
> java program that uses the HBase client and declare that dependency, I
> get a transitive dependency on Hadoop (good), but what version? If
> I'm running Hadoop 0.22, but the published maven artifact for HBase
> depends on 205, can I override that dependency in my POM? Or do we
> need to publish separate maven artifacts for each Hadoop version, so
> that the dependencies for each possible combination can be met (using
> versioning or the version classifier)?
>
> I really don't know enough about maven dependency management. Can we
> specify a version like (0.20.205.0|0.22|0.23)? Or is there any way
> for Hadoop to do a "Provides:" on a virtual package name that those 3
> can share?

Maven is quite flexible in specifying dependencies. Both a version range and the provided scope can be defined in pom.xml to improve compatibility. Certification of individual versions of a dependent component should be expressed in the integration test phase of the HBase pom.xml to ensure some version test validations can be done in HBase builds. If provided is expressed, there is no need for a virtual package, i.e.:

    <dependencies>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>[0.20.205.0,)</version>
        <scope>provided</scope>
      </dependency>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>[0.22.0,)</version>
        <scope>provided</scope>
      </dependency>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>[0.22.0,)</version>
        <scope>provided</scope>
      </dependency>
    </dependencies>

The packaging proposal is to ensure the produced packages are not fixed to a single version of Hadoop. It is useful for QA to run smoke tests without having to make changes to scripts for the release package.

regards,
Eric
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Alejandro Abdelnur 2011-11-11, 20:26
Eric,

One problem is that you cannot depend on hadoop-core (for pre 0.23) and on hadoop-common/hdfs/mapreduce* (for 0.23 onwards) at the same time.

Another problem is that different versions of hadoop bring in different dependencies you want to exclude, thus you have to exclude all deps from all potential hadoop versions you don't want (to complicate things more, jetty changed group name, thus you have to exclude it twice).

Thanks.

Alejandro
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Eric Yang 2011-11-11, 21:24
This is where separate maven profiles can be useful for toggling tests with different dependency trees, for test purposes only.

regards,
Eric
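A minimal sketch of what profile-based toggling might look like, assuming a property named `hadoop.profile` selects the Hadoop line at build time. The profile ids, the property name, and the version numbers are illustrative assumptions, not a copy of the HBase POM. Each profile carries its own dependency set, which sidesteps the problem of declaring hadoop-core and hadoop-common/hadoop-hdfs at the same time:

```xml
<profiles>
  <!-- Active when -Dhadoop.profile is NOT given: the pre-0.23 layout. -->
  <profile>
    <id>hadoop-0.20</id>
    <activation>
      <property>
        <name>!hadoop.profile</name>
      </property>
    </activation>
    <dependencies>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>0.20.205.0</version>
      </dependency>
    </dependencies>
  </profile>

  <!-- Active with -Dhadoop.profile=23: the split 0.23 artifacts. -->
  <profile>
    <id>hadoop-0.23</id>
    <activation>
      <property>
        <name>hadoop.profile</name>
        <value>23</value>
      </property>
    </activation>
    <properties>
      <hadoop23.version>0.23.0</hadoop23.version>
    </properties>
    <dependencies>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>${hadoop23.version}</version>
      </dependency>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>${hadoop23.version}</version>
      </dependency>
    </dependencies>
  </profile>
</profiles>
```

With this layout, `mvn test` would build against the first set and `mvn test -Dhadoop.profile=23` against the second. It toggles what the build and tests see; it does not by itself settle which Hadoop dependency the published artifact advertises to downstream projects.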
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Alejandro Abdelnur 2011-11-11, 21:31
Yes, but what version of Hadoop does your published hbase artifact have? And how do you handle the pre-0.23 and 0.23-onwards split there? How will the developers using hbase artifacts deal with this?

Thanks.

Alejandro
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Eric Yang 2011-11-11, 21:49

My recommendation is that there be no Hadoop artifact in HBase; instead, construct the class path from $PREFIX/share/hadoop. There should be a primary version of Hadoop that the HBase community advises as officially supported. Communities like Bigtop can advertise community-certified releases with their patches.

regards,
Eric

On Nov 11, 2011, at 1:31 PM, Alejandro Abdelnur wrote:

> Yes, but what version of Hadoop your published hbase artifact has? And
> how do you handle the pre-0.23 and 0.23-onwards there? How the
> developers using hbase artifacts will deal with this?
>
> Thanks.
>
> Alejandro
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Alejandro Abdelnur 2011-11-11, 22:09

Eric,

Do you mean that the published HBase POM won't have a Hadoop artifact as a dependency?

If so, the artifact will not be usable by HBase downstream projects unless the developer explicitly adds his/her version of Hadoop.

IMO this is not very kosher.

Is that your idea?

Thanks.

Alejandro

On Fri, Nov 11, 2011 at 1:49 PM, Eric Yang <[EMAIL PROTECTED]> wrote:
> My recommendation is that there is no hadoop artifact in HBase, but construct from $PREFIX/share/hadoop class path. There should be a primary version of Hadoop that is advised by HBase community as officially supported. Communities like Bigtop can advertise community certified release with their patches.
>
> regards,
> Eric
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Eric Yang 2011-11-12, 00:38

No, that is not what I proposed. I proposed having different test profiles to test against different Hadoop releases.

Ultimately, the published HBase POM will have one major version of the Hadoop artifact as a dependency. To quote my own words:

> I recommend that one HBase release should be certified for one major release of Hadoop to reduce risk.

regards,
Eric

On Nov 11, 2011, at 2:09 PM, Alejandro Abdelnur wrote:

> Eric,
>
> Do you mean that the HBASE published POM won't have a Hadoop artifact
> as a dependency?
>
> If so, the artifact will not be usable by HBASE downstream projects
> unless the developer adds his/her version of Hadoop explicitly.
>
> IMO this is not very kosher.
>
> Is that your idea?
>
> Thanks.
>
> Alejandro
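As a rough illustration of the test-profile idea discussed above, a POM could carry one profile per Hadoop line and switch the dependency set with a property, so that hadoop-core and hadoop-common/hdfs never appear in the same dependency tree. This is only a sketch: the profile ids, the `hadoop.profile` property name, and the version numbers are assumptions made for illustration and are not taken from the actual HBase POM.

```xml
<!-- pom.xml fragment (sketch). Profile ids, the hadoop.profile property
     name and all version numbers are illustrative assumptions. -->
<profiles>
  <!-- Active when -Dhadoop.profile is NOT set: pre-0.23 single-jar layout -->
  <profile>
    <id>hadoop-0.20</id>
    <activation>
      <property>
        <name>!hadoop.profile</name>
      </property>
    </activation>
    <dependencies>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>0.20.205.0</version>
      </dependency>
    </dependencies>
  </profile>

  <!-- Active with -Dhadoop.profile=23: split common/hdfs artifacts -->
  <profile>
    <id>hadoop-0.23</id>
    <activation>
      <property>
        <name>hadoop.profile</name>
        <value>23</value>
      </property>
    </activation>
    <dependencies>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>0.23.0</version>
      </dependency>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>0.23.0</version>
      </dependency>
    </dependencies>
  </profile>
</profiles>
```

With a layout like this, a plain `mvn test` would run against the default profile and `mvn test -Dhadoop.profile=23` against the 0.23 artifacts.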
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Roman Shaposhnik 2011-11-16, 03:25

Sorry for reviving the dead thread, but I have, like, a real problem to solve ;-)

On Fri, Nov 11, 2011 at 2:09 PM, Alejandro Abdelnur <[EMAIL PROTECTED]> wrote:
> Eric,
>
> Do you mean that the HBASE published POM won't have a Hadoop artifact
> as a dependency?

OK, so let's see what the problem here really is (and also verify that my understanding is correct):

1. We have a binary Maven artifact for HBase that is pretty well insulated from the underlying Hadoop by a layer of shims and thus can work with *any* version of Hadoop selected at run time. Basically, my reading of Todd's reply is that there's no *reason* to have multiple versions of hbase-0.92.jar.

2. We have a test binary artifact that is NOT insulated. Case in point: org.apache.hadoop.hbase.mapreduce.NMapInputFormat is failing because org.apache.hadoop.mapreduce.JobContext is either a class or an interface depending on which version of Hadoop you compile it against.

Given the above, it sounds like the proper way to fix this is to provide a level of shims for the tests and make them run against any version of Hadoop.

Agreed?

That leaves us with a single problem: when both artifacts become Hadoop-agnostic, we still have to put *some* version of Hadoop into our POM file as a dependency AND we have to either:

1. make that dependency optional, or
2. make all of the downstream projects exclude it

Personally, I'd go with #1 since it makes things much more explicit. But what do others think?

Thanks,
Roman.
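To make option 2 concrete, here is a sketch of what every downstream POM would have to carry if the Hadoop dependency stayed mandatory in the HBase POM and the downstream project ran on the split 0.23 artifacts. The HBase coordinates and all version numbers below are assumptions for illustration only.

```xml
<!-- Downstream pom.xml fragment (sketch of option 2).
     Coordinates and versions are illustrative assumptions. -->
<dependencies>
  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase</artifactId>
    <version>0.92.0</version>
    <exclusions>
      <!-- Drop the Hadoop jar that HBase was published against -->
      <exclusion>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
      </exclusion>
    </exclusions>
  </dependency>

  <!-- Declare the Hadoop artifacts this project actually runs on -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>0.23.0</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>0.23.0</version>
  </dependency>
</dependencies>
```

The exclusion matters most when the artifact ids differ (hadoop-core versus hadoop-common), because Maven's version mediation only picks between versions of the same artifact and would otherwise leave both on the class path.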
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Roman Shaposhnik 2011-11-16, 03:26

On Fri, Nov 11, 2011 at 4:38 PM, Eric Yang <[EMAIL PROTECTED]> wrote:
> No, that is not what I proposed. I proposed having different test profile to test against different Hadoop releases.
> Ultimately, HBase published pom will have one major version of Hadoop artifact as an dependency. Quote from my own words:
>
>> I recommend that one HBase release should be certified for one major release of Hadoop to reduce risk.

I see this as neither practical nor desirable. The proper place for any kind of stack validation is outside of HBase (be it Bigtop or your company's distribution). I'd rather see HBase itself (as a project) be as Hadoop-agnostic as possible.

Thanks,
Roman.
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Alejandro Abdelnur 2011-11-16, 17:58

While not 100% correct, IMO making Hadoop an optional dependency may be the way to go for Hadoop downstream projects. By doing this, the users of the downstream project can set the version of Hadoop they need and do the exclusions for the version they use, without having to worry about all exclusions for all possible Hadoop versions.

Thanks.

Alejandro

On Tue, Nov 15, 2011 at 7:25 PM, Roman Shaposhnik <[EMAIL PROTECTED]> wrote:
> Sorry for reviving the dead thread, but I have, like, a real problem
> to solve ;-)
>
> On Fri, Nov 11, 2011 at 2:09 PM, Alejandro Abdelnur <[EMAIL PROTECTED]> wrote:
>> Eric,
>>
>> Do you mean that the HBASE published POM won't have a Hadoop artifact
>> as a dependency?
>
> Ok, so lets see what the problem here really is (and also verify that
> my understanding is correct):
> 1. we have a binary Maven artifact for HBase that is pretty well
> insulated from the underlying Hadoop by a layer of shims and thus
> can work with *any* version of Hadoop selected at run time. Basically,
> my reading of Todd's reply is that there's no *reason* to have multiple
> versions of hbase-0.92.jar
>
> 2. we have a test binary artifact that is NOT insulated. Case in point:
> org.apache.hadoop.hbase.mapreduce.NMapInputFormat is failing because
> org.apache.hadoop.mapreduce.JobContext is either class or interface
> depending on which version of Hadoop you compile it against.
>
> Now given the above it sounds like a proper way to fix this is to provide
> a level of shims for tests and make them run against any version of Hadoop.
>
> Agreed?
>
> That leaves us with a single problem -- when both artifacts become hadoop
> agnostic we still have to put *some* version of Hadoop into our POM file
> as a dependency AND we have to either:
> 1. make that dependency optional
> 2. make all of the downstream exclude it
>
> Personally, I'd go with #1 since it makes things much more explicit. But what
> do others think?
>
> Thanks,
> Roman.
>
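For illustration, if Hadoop were an optional dependency of HBase, a downstream POM might look like the sketch below: the project picks its own Hadoop version and excludes only the transitive dependencies of that one version. The HBase coordinates, the version numbers, and the choice of Jetty as the excluded dependency are assumptions made for the example.

```xml
<!-- Downstream pom.xml fragment (sketch, assuming HBase declares Hadoop
     as optional). Coordinates, versions and the excluded artifact are
     illustrative assumptions. -->
<dependencies>
  <!-- No Hadoop arrives transitively through this dependency -->
  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase</artifactId>
    <version>0.92.0</version>
  </dependency>

  <!-- The downstream project chooses its Hadoop version explicitly -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <version>0.20.205.0</version>
    <exclusions>
      <!-- Only exclusions relevant to THIS Hadoop version are needed -->
      <exclusion>
        <groupId>org.mortbay.jetty</groupId>
        <artifactId>jetty</artifactId>
      </exclusion>
    </exclusions>
  </dependency>
</dependencies>
```

The point of the design is that the exclusion list stays tied to the single Hadoop version the project uses, instead of covering every version HBase might have been published against.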
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Stack 2011-11-16, 18:07

On Tue, Nov 15, 2011 at 7:26 PM, Roman Shaposhnik <[EMAIL PROTECTED]> wrote:
> On Fri, Nov 11, 2011 at 4:38 PM, Eric Yang <[EMAIL PROTECTED]> wrote:
>> No, that is not what I proposed. I proposed having different test profile to test against different Hadoop releases.
>> Ultimately, HBase published pom will have one major version of Hadoop artifact as an dependency. Quote from my own words:
>>
>>> I recommend that one HBase release should be certified for one major release of Hadoop to reduce risk.
>
> I see this as neither practical, nor desirable. A proper place for any kind
> of stack validation is outside of HBase (be it Bigtop or your company's
> distribution). I'd rather see HBase itself (as a project) be as Hadoop
> agnostic as possible.
>

I'm w/ Roman on this one. We've been trying for a good while to move away from being tied to a single Hadoop version.

St.Ack
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Stack 2011-11-16, 18:10

On Wed, Nov 16, 2011 at 9:58 AM, Alejandro Abdelnur <[EMAIL PROTECTED]> wrote:
> While not 100% correct, IMO making Hadoop an optional dependency it
> may be the way to go for Hadoop downstream projects. By doing this,
> the users of the downstream project can set the version of Hadoop they
> need and do the exclusions for that version they use without having to
> worry about all exclusions for all possible Hadoop versions.
>

What would this look like, AA? We'd denote a Hadoop version for compile but we'd not bundle a Hadoop in what we ship?

Some work was done (by you?) to favor a Hadoop specified by environment variables. Do you think we should tend in this direction?

What about the case where you want to do a standalone HBase install, where you want to run a single HBase instance?

Good stuff,
St.Ack
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Alejandro Abdelnur 2011-11-16, 19:16

Stack,

IMO an HBase release should not include Hadoop JARs. It would then require an installed Hadoop to run, and it would use the Hadoop 'hadoop' CLI to start.

Thanks.

Alejandro

On Wed, Nov 16, 2011 at 10:10 AM, Stack <[EMAIL PROTECTED]> wrote:
> On Wed, Nov 16, 2011 at 9:58 AM, Alejandro Abdelnur <[EMAIL PROTECTED]> wrote:
>> While not 100% correct, IMO making Hadoop an optional dependency it
>> may be the way to go for Hadoop downstream projects. By doing this,
>> the users of the downstream project can set the version of Hadoop they
>> need and do the exclusions for that version they use without having to
>> worry about all exclusions for all possible Hadoop versions.
>>
>
> What would this look like AA? We'd denote an Hadoop version for
> compile but we'd not bundle an hadoop in what we ship?
>
> Some work was done (by you?) to favor an hadoop specified by
> environment variables. You think we should tend this direction?
>
> What for the case where you want to do a standalone hbase install,
> where you want to run a single hbase instance?
>
> Good stuff,
> St.Ack
>
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Gary Helmling 2011-11-16, 19:27

>
> IMO a Hbase release should not include Hadoop JARs. Then, it would
> require a Hadoop installed to run. And it would use the Hadoop
> 'hadoop' CLI to start.
>

I agree with not bundling Hadoop for RPM/deb packages where the package management system can take care of the dependencies.

I'm -1 on excluding the Hadoop jar from the tarball distribution, however. Forcing newcomers to download a Hadoop distribution in addition to HBase, just to be able to start up a local _stand-alone_ HBase instance for testing, is shooting ourselves in the foot as a project. We already get flak for being too complicated to run; let's not add to it.

--gh
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Roman Shaposhnik 2011-11-18, 02:29

On Wed, Nov 16, 2011 at 11:27 AM, Gary Helmling <[EMAIL PROTECTED]> wrote:
>>
>> IMO a Hbase release should not include Hadoop JARs. Then, it would
>> require a Hadoop installed to run. And it would use the Hadoop
>> 'hadoop' CLI to start.
>>
>
> I agree with not bundling Hadoop for RPM/deb packages where the
> package management system can take care of the dependencies.
>
> I'm -1 on excluding the Hadoop jar from the tarball distribution,
> however. Forcing newcomers to download a Hadoop distribution in
> addition to HBase, just to be able to start up a local _stand-alone_
> HBase instance for testing is shooting ourselves in the foot as a
> project. We already get flak for being too complicated to run, let's
> not add to it.

OK, so it sounds like we have a consensus on:

1. making tests MR-agnostic
2. making Hadoop an optional dependency, with the lowest (current) version of Hadoop being the default
3. keeping the Hadoop jar as part of the tarball distro

I'll open up JIRAs accordingly. Thanks for the feedback!

Thanks,
Roman.
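Point 2 of the consensus could be expressed in the HBase POM roughly as follows. This is a hedged sketch, not the actual HBase build: the `hadoop.version` property name and the default value shown are assumptions chosen for illustration.

```xml
<!-- HBase pom.xml fragment (sketch). The hadoop.version property name
     and its default value are illustrative assumptions. -->
<properties>
  <!-- Default Hadoop version used for compiling and for the tarball -->
  <hadoop.version>0.20.205.0</hadoop.version>
</properties>

<dependencies>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-core</artifactId>
    <version>${hadoop.version}</version>
    <!-- Optional: available to the HBase build itself, but not passed
         on transitively to projects that depend on HBase -->
    <optional>true</optional>
  </dependency>
</dependencies>
```

A property set on the command line (for example `mvn -Dhadoop.version=... package`) takes precedence over the POM default, which would allow building against another release of the same artifact. Moving to the split 0.23 artifacts would still need a different dependency set, since the artifact ids change.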
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Alejandro Abdelnur 2011-11-18, 04:04

On Thu, Nov 17, 2011 at 6:29 PM, Roman Shaposhnik <[EMAIL PROTECTED]> wrote:
>...
> 3. keeping Hadoop jar as part of the tarball distro

Do we want to have an hbase-developer tarball (with a chosen version of Hadoop) and an hbase-bin tarball without it, mirroring what OS packages will have?

Also, the primary release tarball is an hbase-src tarball, which is a buildable mirror of the source; the binary tarballs are convenience tarballs, correct?

Thanks.

Alejandro
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Stack 2011-11-18, 05:46

On Thu, Nov 17, 2011 at 6:29 PM, Roman Shaposhnik <[EMAIL PROTECTED]> wrote:
> On Wed, Nov 16, 2011 at 11:27 AM, Gary Helmling <[EMAIL PROTECTED]> wrote:
>>>
>>> IMO a Hbase release should not include Hadoop JARs. Then, it would
>>> require a Hadoop installed to run. And it would use the Hadoop
>>> 'hadoop' CLI to start.
>>>
>>
>> I agree with not bundling Hadoop for RPM/deb packages where the
>> package management system can take care of the dependencies.
>>
>> I'm -1 on excluding the Hadoop jar from the tarball distribution,
>> however. Forcing newcomers to download a Hadoop distribution in
>> addition to HBase, just to be able to start up a local _stand-alone_
>> HBase instance for testing is shooting ourselves in the foot as a
>> project. We already get flak for being too complicated to run, let's
>> not add to it.
>
> Ok, so it sounds like we have a consensus on:
> 1. make tests MR-agnostic
> 2. making Hadoop an optional dependency with the lowest (current
> version of Hadoop being the default).
> 3. keeping Hadoop jar as part of the tarball distro
>
> I'll open up JIRAs accordingly. Thanks for the feedback!
>

Sounds reasonable.

St.Ack
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Stack 2011-11-18, 05:48

On Thu, Nov 17, 2011 at 8:04 PM, Alejandro Abdelnur <[EMAIL PROTECTED]> wrote:
> On Thu, Nov 17, 2011 at 6:29 PM, Roman Shaposhnik <[EMAIL PROTECTED]> wrote:
>>...
>> 3. keeping Hadoop jar as part of the tarball distro
>
> Do we want to have an hbase-developer tarball (with a chosen version
> of Hadoop) and a hbase-bin tarball without it, mirroring what OS
> packages will have?
>

This sounds complicated?

> Also, the primary release tarball is a hbase-src tarball which is a
> mirror of the source, buildable, the binary tarballs are convenience
> tarballs, correct?
>

Up to now we've done what Hadoop does, shipping bin+src in the one bundle. I think we probably want to keep the ability to download and launch (w/o requiring you to build in between download and launch).

St.Ack
-
Re: HBase .92 maven artifacts compiled against different releases of Hadoop
Elliott Clark 2011-11-18, 06:43

On Thu, Nov 17, 2011 at 9:48 PM, Stack <[EMAIL PROTECTED]> wrote:
>
> Up to now we've done what hadoop does shipping bin+src in the one
> bundle. I think we probably want to keep up the ability to download
> and launch (w/o requiring you build in between download and launch).
>
> St.Ack
>

Please do. For automated deploys, the ability to push out the packaged tar.gz's is a lifesaver.
|