-
Code review request for hbase-4120 table priority
yuzhihong@... 2011-12-12, 14:43

Hi,
4120 has gone through more than 20 revisions.

Please provide your comments.

I plan to integrate it this week.

Thanks
-
Re: Code review request for hbase-4120 table priority
Stack 2011-12-12, 23:43

On Mon, Dec 12, 2011 at 6:43 AM, <[EMAIL PROTECTED]> wrote:
> Hi,
> 4120 has gone through more than 20 revisions.
>
> Please provide your comments.
>
> I plan to integrate it this week.
>

I'd suggest hold on commit until some other committers have had a
looksee. This is an important feature that we need to get right and
there is no need to rush it in.

Thanks Ted (and thanks for the reviews so far),
St.Ack
-
Re: Code review request for hbase-4120 table priority
yuzhihong@... 2011-12-12, 23:50

Waiting for review comments from other committers.
The implementation is pluggable by using coprocessors.

Cheers
-
Re: Code review request for hbase-4120 table priority
Todd Lipcon 2011-12-13, 00:03

If it's completely a coprocessor, then it seems we should let it bake
on github and only incorporate in core if we find that a number of the
core HBase users are using it in production. Am I misunderstanding the
implementation? (haven't looked at the most recent patch)

-Todd

--
Todd Lipcon
Software Engineer, Cloudera
-
Re: Code review request for hbase-4120 table priority
Andrew Purtell 2011-12-13, 00:30

HBASE-4120 deals with only the RPC prioritization parts. This cannot be implemented as a coprocessor.

> If it's completely a coprocessor, then it seems we should let it bake
> on github and only incorporate in core if we find that a number of the
> core HBase users are using it in production.

The addition of the coprocessor framework is from my point of view a double-edged sword. It can be a tool to clarify core and reduce maintenance burden on the community for various boutique functions. On the other hand it can be used to freeze further feature development and leave users who need such features out in the cold to go their own way or make a private fork.

I've become aware of several private forks of HDFS and HBase. Too bad. A pooling of dev resources would almost surely have been better.

HBASE-4120 and successor issues are / will be an attempt to take what runs in production at Taobao and upstream it. They could just do a code drop on GitHub of what they have now. Would be much easier than reimplementing it as a coprocessor. However they have incentive here to give back to upstream, additional co-development with the community, for inclusion of their work in the distribution. What is the incentive if we want them to make the reimplementation effort only to then just drop that on GitHub?

There are some cases where it would benefit all users to include coprocessor-based features in tree: Like security. Perhaps like constraints. Like the isolation/allocation stuff, for perhaps 0.94. Perhaps secondary indexing. In all of these cases, though the majority of the implementation is as coprocessor, there must be some changes to core -- to the coprocessor framework, generally -- to support it. A coprocessor can be included in the tree as a Maven module. Without such inclusion, the rationale for not dropping the enabling logic in core may be lacking ... "oh, this just supports some GitHub thing".

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
-
Re: Code review request for hbase-4120 table priority
Andrew Purtell 2011-12-13, 00:36

By the way, -1 to this as a criterion:

> If it's completely a coprocessor, then it seems we should let it bake
> on github and only incorporate in core if we find that a number of the
> core HBase users are using it in production.

HBase as a project should not have as a criterion for inclusion of some feature that Cloudera and SU and Facebook run it. Core managed to escape Yahoo. Let's not run history in reverse here in HBase land. And, actually, this makes it worse, because the occurrence that a number of core HBase users (multiple) will all need something is substantially less likely than if one might find it useful; or, maybe, only users outside of those with such self-appointed attitude, yet perhaps a community multiples in size of "core users".

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
-
Re: Code review request for hbase-4120 table priority
Todd Lipcon 2011-12-13, 00:55

On Mon, Dec 12, 2011 at 4:36 PM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
>
> HBase as a project should not have as a criterion for inclusion of some feature that Cloudera and SU and Facebook run it. Core managed to escape Yahoo. Let's not run history in reverse here in HBase land. And, actually, this makes it worse, because the occurrence that a number of core HBase users (multiple) will all need something is substantially less likely than if one might find it useful; or, maybe, only users outside of those with such self-appointed attitude, yet perhaps a community multiples in size of "core users".

It's not about Cloudera/SU/FB - it's about code that will be supported by people who are committed to the project. TrendMicro certainly fits the bill. I of course mean no offense to Lu Jia, but neither he nor Taobao has made continued contributions in the past - just one other bug fix beyond the HBASE-4120 project.

If we have a few of the core people committed to running this in production and supporting it in the future, I'm all for it (just like I am +1 on security). I just want to avoid repeating mistakes like the Avro server which isn't really supported despite being in our codebase. (You'll note this was a Cloudera contribution but from a contributor who was doing this in his spare time rather than part of job responsibilities, and we have never run it in production scenarios)

I am consistently conservative on what goes into the project because we have to stand behind what we release. I certainly don't think _all_ core people should find every feature useful (e.g. REST and Thrift are examples of some things which are useless to many but I think make sense). But if _no_ core people see a feature as a requirement then I'd rather let it bake until we have many people requesting it. Otherwise people download HBase, try out these "fringe" features, and get a bad taste in their mouth when they've bit-rot across several versions of little usage.

-Todd

--
Todd Lipcon
Software Engineer, Cloudera
-
Re: Code review request for hbase-4120 table priority
yuzhihong@... 2011-12-13, 01:03

I will be supporting table priority and related changes in the foreseeable future.

Cheers
-
Re: Code review request for hbase-4120 table priority
Andrew Purtell 2011-12-13, 01:12

Hi Todd,

> It's not about Cloudera/SU/FB - it's about code that will be supported
> by people who are committed to the project.

Fine, but that is not what you stated exactly, so I felt it important to -1 the language in your previous email.

> If we have a few of the core people committed to running this in
> production and supporting it in the future, I'm all for it (just like
> I am +1 on security). I just want to avoid repeating mistakes like the
> Avro server which isn't really supported despite being in our
> codebase.

My point of view here is that what you said is fine, but it would have been better if you stopped at "If we have a few of the core people committed".

I would really like to see the isolation feature in 0.94+, so I intend to work with Jia and, if there is a successful result, support it going forward in a manner like Stargate, even though I may not run it personally (like I don't run Stargate). I take an expansive view of what open source projects should accept...

> I just want to avoid repeating mistakes like the
> Avro server which isn't really supported despite being in our
> codebase.

... even though it may lead to situations like that.

On the other hand there are users and committers of HBase for whom stability is paramount. I expect the tension :-); it will make us a healthier project.

But we cannot have a new feature acceptance criterion that requires several "core users" run it in production.

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)
-
Re: Code review request for hbase-4120 table priority
Nicolas Spiegelberg 2011-12-13, 03:12

>I would really like to see the isolation feature in 0.94+, so I intend to
>work with Jia and, if there is a successful result, support it going
>forward in a manner like Stargate, even though I may not run it
>personally (like I don't run Stargate). I take an expansive view of what
>open source projects should accept...
>
>On the other hand there are users and committers of HBase for whom
>stability is paramount. I expect the tension :-); it will make us a
>healthier project.
>
>But we cannot have a new feature acceptance criteria that requires
>several "core users" run it in production.

I think we should be really careful here because a laissez-faire attitude will keep the project from reaching a new level of maturity. There are a number of committers from various companies who are using this for business-critical applications, who want to provide users with a build that they can use for business-critical applications, and who are independently coming to the same conclusion that we need a more stable trunk. In fact, it was important enough for us to discuss in depth at the HBase pow-wow a couple weeks ago, with 30+? people. There are other open source projects out there that aren't used for as critical data and would love to have that next feature that might differentiate them. As a maturing product, we don't need a ton of new features as much as we need our features to work properly and have some well-thought-out tweaks for completeness.

That said, this thread has been split between a discussion about HBASE-4120 and a discussion about open source inclusivity.

I appreciate the effort that Taobao is putting into HBASE-4120. I'm glad that they have plans to use this in a production environment. I'm glad to hear that there are active companies that also want to use this feature in a critical environment. That said, I gave an initial scan for HBASE-4120 and found race conditions & necessary design refactoring. This was just an introductory scan, so I didn't look very hard and expected to find more issues in a later scan. I have not looked at the diff recently because of prod release issues; but I find it disturbing that other reviewers didn't find those issues, since I waited over a month to review that patch. I would expect that a committer wanting to use this feature in his company for production would do a much more thorough analysis than I did.
-
Re: Code review request for hbase-4120 table priority
Andrew Purtell 2011-12-13, 06:14

> I would expect that a committer wanting to use this feature in his company
> for production would do a much more thorough analysis than I did.

I haven't looked at the patch yet. This is early, would be for 0.94 and 0.92 isn't even out yet, etc. Why make this personal?

> There are other open source projects out
> there that aren't used for as critical data and would love to have that
> next feature that might differentiate them.

Message received. A new level of "maturity". Go elsewhere.

Best regards,

   - Andy
-
Re: Code review request for hbase-4120 table priority
Stack 2011-12-13, 06:24

On Mon, Dec 12, 2011 at 10:14 PM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
>> There are other open source projects out
>> there that aren't used for as critical data and would love to have that
>> next feature that might differentiate them.
>
> Message received. A new level of "maturity". Go elsewhere.
>

Am I reading the above correctly Andrew when I interpret it as: "For a
stable project, go elsewhere?"
St.Ack
-
Re: Code review request for hbase-4120 table priority
Andrew Purtell 2011-12-13, 06:34

No that's not correct.

I believe I was told off by Nicolas and Taobao should look to contribute elsewhere, new features to some other open source project.
-
Re: Code review request for hbase-4120 table priority
lars hofhansl 2011-12-13, 07:23

While I haven't looked (in depth) at the patch, yet, this is definitely a feature that will be extremely helpful for Salesforce's multitenant architecture to isolate tenants and services from each other.

While we don't have HBase in our production data centers, yet (working on it), I am certain that we will use this feature eventually.

Would it help to break the patch into multiple smaller patches?

Off the bat I think of:
1. the grouping logic
2. regionserver configuration (caching, etc) per group
3. table priorities
4. etc... (folks who have actually looked at the patch can probably identify better demarcations between the aspects of this change.)

That would certainly make it more manageable for me - personally - to review the code.

-- Lars
-
Re: Code review request for hbase-4120 table priorityyuzhihong@... 2011-12-13, 08:42
Thanks for the suggestion, Lars.
The original scope for 4120 is bigger than the latest patch which only covers table priorities. Let's perform more reviews for the current patch. We can create more subtasks for the umbrella feature. Cheers On Dec 12, 2011, at 11:23 PM, lars hofhansl <[EMAIL PROTECTED]> wrote: > While I haven't looked (in depth) at the patch, yet, this is definitely a feature that will be extremely helpful > for Salesforce's multitenant architecture to isolate tenants and services from each other. > > While we don't have HBase in our production data centers, yet (working on it), I am certain that we will use this feature > eventually. > > Would it help to break the patch into multiple smaller patches? > > Off the bat I think of: > 1. the grouping logic > 2. regionserver configuration (caching, etc) per group > 3. table priorities > 4. etc... (folks who have actually looked at the patch can probably identify better demarcations between the aspects of this change.) > > That would certainly make it more manageable for me - personally - to review the code. > > -- Lars > > > ----- Original Message ----- > From: Todd Lipcon <[EMAIL PROTECTED]> > To: [EMAIL PROTECTED]; Andrew Purtell <[EMAIL PROTECTED]> > Cc: > Sent: Monday, December 12, 2011 4:55 PM > Subject: Re: Code review request for hbase-4120 table priority > > On Mon, Dec 12, 2011 at 4:36 PM, Andrew Purtell <[EMAIL PROTECTED]> wrote: >> >> HBase as a project should not have as a criteria for inclusion of some feature that Cloudera and SU and Facebook run it. Core managed to escape Yahoo. Let's not run history in reverse here in HBase land. And, actually, this makes it worse, because the the occurrence that a number of core HBase users (multiple) will all need something is substantially less likely than if one might find it useful; or, maybe, only users outside of those with such self-appointed attitude, yet perhaps a community multiples in size of "core users". 
> > It's not about Cloudera/SU/FB - it's about code that will be supported > by people who are committed to the project. TrendMicro certainly fits > the bill. I of course mean no offense to Lu Jia, but neither he nor > Taobao has made continued contributions in the past - just one other > bug fix beyond the HBASE-4120 project. > > If we have a few of the core people committed to running this in > production and supporting it in the future, I'm all for it (just like > I am +1 on security). I just want to avoid repeating mistakes like the > Avro server which isn't really supported despite being in our > codebase. (You'll note this was a Cloudera contribution but from a > contributor who was doing this in his spare time rather than part of > job responsibilities, and we have never run it in production > scenarios) > > I am consistently conservative on what goes into the project because > we have to stand behind what we release. I certainly don't think _all_ > core people should find every feature useful (eg REST and Thrift are > examples of some things which are useless to many but I think make > sense). But if _no_ core people see a feature as a requirement then > I'd rather let it bake until we have many people requesting it. > Otherwise people download HBase, try out these "fringe" features, and > get a bad taste in their mouth when they've bit-rot across several > versions of little usage. > > -Todd > -- > Todd Lipcon > Software Engineer, Cloudera > +
Re: Code review request for hbase-4120 table priority
Jonathan Hsieh 2011-12-13, 19:57
Note: I've only done a quick look at the jira and the code. The high level design document/approach seems reasonable and I think most agree that this is a useful feature and that a lot of effort has gone into it.

The feature is off by default -- I can see one main difference in this situation compared to other major newish generally-considered experimental or incomplete features (replication, off-heap slab cache, online schema changes). This feature doesn't have one of the current HBase committers using/testing it in their production environments or in their test environment.

This seems perfect for *a feature branch* as we talked briefly about at the Pow-wow. There seem to be some problems identified that will result in follow on issues (races mentioned). Using a branch would:
* make it available at apache, which allows devs to test it,
* allow a committer who is championing this to test it by using it more and to iron out glaring problems in their environment,
* encourage and shepherd the contributor, allowing them to justify continued effort,
* allow all of us to defer the decision to fold the feature into 0.94 (or 0.96, or later) when more folks are familiar or comfortable with it.

Who knows, maybe some of the TaoBao folks will eventually become committers.

Jon.

On Tue, Dec 13, 2011 at 12:42 AM, <[EMAIL PROTECTED]> wrote:

> Thanks for the suggestion, Lars.
> The original scope for 4120 is bigger than the latest patch which only covers table priorities.
>
> Let's perform more reviews for the current patch. We can create more subtasks for the umbrella feature.
>
> Cheers
>
>
>
> On Dec 12, 2011, at 11:23 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>
> > While I haven't looked (in depth) at the patch, yet, this is definitely a feature that will be extremely helpful
> > for Salesforce's multitenant architecture to isolate tenants and services from each other.
> >
> > While we don't have HBase in our production data centers, yet (working on it), I am certain that we will use this feature
> > eventually.
> >
> > Would it help to break the patch into multiple smaller patches?
> >
> > Off the bat I think of:
> > 1. the grouping logic
> > 2. regionserver configuration (caching, etc) per group
> > 3. table priorities
> > 4. etc... (folks who have actually looked at the patch can probably identify better demarcations between the aspects of this change.)
> >
> > That would certainly make it more manageable for me - personally - to review the code.
> >
> > -- Lars
> >
> >
> > ----- Original Message -----
> > From: Todd Lipcon <[EMAIL PROTECTED]>
> > To: [EMAIL PROTECTED]; Andrew Purtell <[EMAIL PROTECTED]>
> > Cc:
> > Sent: Monday, December 12, 2011 4:55 PM
> > Subject: Re: Code review request for hbase-4120 table priority
> >
> > On Mon, Dec 12, 2011 at 4:36 PM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
> >>
> >> HBase as a project should not have as a criteria for inclusion of some feature that Cloudera and SU and Facebook run it. Core managed to escape Yahoo. Let's not run history in reverse here in HBase land. And, actually, this makes it worse, because the occurrence that a number of core HBase users (multiple) will all need something is substantially less likely than if one might find it useful; or, maybe, only users outside of those with such self-appointed attitude, yet perhaps a community multiples in size of "core users".
> >
> > It's not about Cloudera/SU/FB - it's about code that will be supported
> > by people who are committed to the project. TrendMicro certainly fits
> > the bill. I of course mean no offense to Lu Jia, but neither he nor
> > Taobao has made continued contributions in the past - just one other
> > bug fix beyond the HBASE-4120 project.
> >
> > If we have a few of the core people committed to running this in
> > production and supporting it in the future, I'm all for it (just like
> > I am +1 on security). I just want to avoid repeating mistakes like the
> > Avro server which isn't really supported despite being in our

// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// [EMAIL PROTECTED]
Re: Code review request for hbase-4120 table priority
Stack 2011-12-13, 20:53
On Tue, Dec 13, 2011 at 11:57 AM, Jonathan Hsieh <[EMAIL PROTECTED]> wrote:
> Note: I've only done a quick look at the jira and the code. The high level
> design document/approach seems reasonable and I think most agree that this
> is a useful feature and that a lot of effort has gone into it.
>

Agreed.

> The feature is off by default -- I can see one main difference in this
> situation compared to other major newish generally-considered experimental
> or incomplete features (replication, off-heap slab cache, online schema
> changes). This feature doesn't have one of the current HBase committers
> using/testing it in their production environments or in their test
> environment.
>

A couple of the lads are shaping up to back this feature.

> This seems perfect for *a feature branch* as we talked briefly about at the
> Pow-wow. There seem to be some problems identified that will result in
> follow on issues (races mentioned). Using a branch would:
> * make it available at apache, which allows devs to test it,
> * allow a committer who is championing this to test it by using it more
> and to iron out glaring problems in their environment,
> * encourage and shepherd the contributor, allowing them to justify
> continued effort,
> * allow all of us to defer the decision to fold the feature into 0.94 (or
> 0.96, or later) when more folks are familiar or comfortable with it.
>

Agree.

> Who knows, maybe some of the TaoBao folks will eventually become committers.
>

Looking forward to the day....

St.Ack
Re: Code review request for hbase-4120 table priority
Stack 2011-12-13, 16:57
On Mon, Dec 12, 2011 at 11:23 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
> While I haven't looked (in depth) at the patch, yet, this is definitely a feature that will be extremely helpful
> for Salesforce's multitenant architecture to isolate tenants and services from each other.
>
> While we don't have HBase in our production data centers, yet (working on it), I am certain that we will use this feature
> eventually.
>

I think it fair to say we all need better support for multi-tenancy not only in hbase itself but also down through the hadoop layers.

St.Ack
Re: Code review request for hbase-4120 table priority
Todd Lipcon 2011-12-13, 01:11
On Mon, Dec 12, 2011 at 4:30 PM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
> I've become aware of several private forks of HDFS and HBase. Too bad. A pooling of dev resources would have almost surely have been better.

BTW, to this point -- some of the private forks of HDFS and HBase are due to the opposite problem. For example, FB branched off at 0.89 and still uses that branch in production because they've seen the trunk moving too *fast* and accepting too much new stuff. (at least according to emails a couple months ago - apologies if I misunderstood that and putting words in people's mouths)

So in summary I think we have to find the balance. IMO our balance has been too far towards moving fast and not towards stability in the last year. As we grow up we need to shift back towards stability of the core.

-Todd
--
Todd Lipcon
Software Engineer, Cloudera
Re: Code review request for hbase-4120 table priority
yuzhihong@... 2011-12-13, 00:35
This feature is used in production at Taobao (China's eBay). You can find a related description of cluster size, etc. on the Jira.

My understanding of hbase-4120 is that we are making the code conform to Apache hbase standards. There are related changes for the web UI, etc.

Without clear feedback from Apache hbase, it is somewhat difficult for the contributor to pursue further development.

Regards

On Dec 12, 2011, at 6:03 PM, Todd Lipcon <[EMAIL PROTECTED]> wrote:

> If it's completely a coprocessor, then it seems we should let it bake
> on github and only incorporate in core if we find that a number of the
> core HBase users are using it in production. Am I misunderstanding the
> implementation? (haven't looked at the most recent patch)
>
> -Todd
>
> On Mon, Dec 12, 2011 at 3:50 PM, <[EMAIL PROTECTED]> wrote:
>> Waiting for review comments from other committers.
>> The implementation is pluggable by using coprocessors.
>>
>> Cheers
>>
>>
>>
>> On Dec 12, 2011, at 5:43 PM, Stack <[EMAIL PROTECTED]> wrote:
>>
>>> On Mon, Dec 12, 2011 at 6:43 AM, <[EMAIL PROTECTED]> wrote:
>>>> Hi,
>>>> 4120 has gone through more than 20 revisions.
>>>>
>>>> Please provide your comments.
>>>>
>>>> I plan to integrate it this week.
>>>>
>>>
>>> I'd suggest hold on commit until some other committers have had a
>>> looksee. This is an important feature that we need to get right and
>>> there is no need to rush it in.
>>>
>>> Thanks Ted (and thanks for the reviews so far),
>>> St.Ack
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera