-
LimitedPrivate and HBase
Allen Wittenauer 2011-06-06, 16:45
I have some concerns over the recent usage of LimitedPrivate being opened up to HBase. Shouldn't HBase really be sticking to public APIs rather than poking through some holes? If HBase needs an API, wouldn't other clients as well?
-
Re: LimitedPrivate and HBase
Todd Lipcon 2011-06-06, 17:00
On Mon, Jun 6, 2011 at 9:45 AM, Allen Wittenauer <[EMAIL PROTECTED]> wrote:
> I have some concerns over the recent usage of LimitedPrivate being
> opened up to HBase. Shouldn't HBase really be sticking to public APIs
> rather than poking through some holes? If HBase needs an API, wouldn't
> other clients as well?
IMO LimitedPrivate can be used to open an API for a specific project when it's not clear that the API is generally useful, and/or we anticipate the API might be pretty unstable. Marking it LimitedPrivate to HBase gives us the opportunity to talk to the HBase team and say "hey, we want to rename this without @Deprecation" or "hey, we're going to kill this, is that OK?" Making it true public, even if we call it Unstable, is a bit harder to move.
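For readers who have not seen the annotation in use, here is a minimal sketch of what marking a class LimitedPrivate to HBase amounts to. The `Audience` annotations are hypothetical stand-ins declared locally so the example compiles without Hadoop on the classpath (Hadoop's own annotations live in its classification package), and `EmbeddedWebServer`, `InternalHelper`, and `AudienceDemo` are made-up names for illustration only.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.Arrays;

// Hypothetical stand-ins for Hadoop's audience annotations, declared locally
// so that this sketch is self-contained. They are not the real Hadoop types.
final class Audience {
    private Audience() {}

    @Documented
    @Retention(RetentionPolicy.RUNTIME)
    @interface Public {}

    @Documented
    @Retention(RetentionPolicy.RUNTIME)
    @interface LimitedPrivate {
        // Names of the projects that are allowed to depend on the API.
        String[] value();
    }

    @Documented
    @Retention(RetentionPolicy.RUNTIME)
    @interface Private {}
}

// Illustrative class opened to exactly one named downstream project.
@Audience.LimitedPrivate({"HBase"})
class EmbeddedWebServer {}

// Illustrative class that stays internal.
@Audience.Private
class InternalHelper {}

final class AudienceDemo {
    // Returns a short label describing the audience declared on a class.
    static String describe(Class<?> type) {
        Audience.LimitedPrivate limited = type.getAnnotation(Audience.LimitedPrivate.class);
        if (limited != null) {
            return "LimitedPrivate" + Arrays.toString(limited.value());
        }
        if (type.isAnnotationPresent(Audience.Public.class)) {
            return "Public";
        }
        if (type.isAnnotationPresent(Audience.Private.class)) {
            return "Private";
        }
        return "unclassified";
    }

    public static void main(String[] args) {
        System.out.println(describe(EmbeddedWebServer.class));
        System.out.println(describe(InternalHelper.class));
    }
}
```

The point of the named-project list is the one made above: whoever changes the class can see from the annotation which team to talk to before renaming or removing anything.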
I agree that most of these things in the long run would be determined generally useful and made public.
Do you have a specific thing in mind?
-Todd
--
Todd Lipcon
Software Engineer, Cloudera
-
Re: LimitedPrivate and HBase
Allen Wittenauer 2011-06-06, 17:36
On Jun 6, 2011, at 10:00 AM, Todd Lipcon wrote:
> On Mon, Jun 6, 2011 at 9:45 AM, Allen Wittenauer <[EMAIL PROTECTED]> wrote:
>
>> I have some concerns over the recent usage of LimitedPrivate being
>> opened up to HBase. Shouldn't HBase really be sticking to public APIs
>> rather than poking through some holes? If HBase needs an API, wouldn't
>> other clients as well?
>
> IMO LimitedPrivate can be used to open an API for a specific project when
> it's not clear that the API is generally useful, and/or we anticipate the
> API might be pretty unstable. Marking it LimitedPrivate to HBase gives us
> the opportunity to talk to the HBase team and say "hey, we want to rename
> this without @Deprecation" or "hey, we're going to kill this, is that OK?"
> Making it true public, even if we call it Unstable, is a bit harder to move.

True, but what makes HBase a unique snowflake? What are the criteria by which an API gets opened to something outside of the Hadoop umbrella? Why shouldn't something be private to Hadoop just because HBase or someone else wants to use it?

> I agree that most of these things in the long run would be determined
> generally useful and made public.
>
> Do you have a specific thing in mind?

Not really. I just wanted to take a temperature to see what other people are thinking and whether we actually have any guidelines. Doing this wild west style seems fraught with danger. ("You opened this up for them, why can't you open this up for us?")
(Although I must admit that the first usage of this--the HttpServer code--sort of left me with a "Seriously? You must be kidding." kind of feeling about it.)
-
Re: LimitedPrivate and HBase
Stack 2011-06-06, 18:34
On Mon, Jun 6, 2011 at 9:45 AM, Allen Wittenauer <[EMAIL PROTECTED]> wrote:
> I have some concerns over the recent usage of LimitedPrivate being opened up to HBase. Shouldn't HBase really be sticking to public APIs rather than poking through some holes? If HBase needs an API, wouldn't other clients as well?
HBase uses public APIs. A method we relied on went from protected to private. Fixing this brought on the flagging of HttpServer with LimitedPrivate (HttpServer, the class flagged, is mostly internal to Hadoop but HBase, because, in part, of its subproject provenance, extends HttpServer providing its UI reusing Hadoop's log level, thread dumping, etc., servlets)
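As a rough illustration of the breakage described above, the sketch below shows a downstream class that extends an upstream server and relies on a protected extension point. `BaseWebServer`, `DownstreamInfoServer`, `addServlet`, and the servlet names are all hypothetical; this is not Hadoop's HttpServer code, only the general shape of the dependency.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an upstream web server class; NOT Hadoop's HttpServer.
class BaseWebServer {
    private final List<String> servlets = new ArrayList<>();

    BaseWebServer() {
        // Built-in diagnostics that a subclass gets for free.
        servlets.add("logLevel");
        servlets.add("stacks");
    }

    // Extension point. If this were narrowed from protected to private,
    // DownstreamInfoServer below would no longer compile.
    protected void addServlet(String name) {
        servlets.add(name);
    }

    // Returns a copy so callers cannot mutate the server's internal list.
    List<String> servletNames() {
        return new ArrayList<>(servlets);
    }
}

// Hypothetical downstream project reusing the built-in servlets and adding its own.
class DownstreamInfoServer extends BaseWebServer {
    DownstreamInfoServer() {
        addServlet("status");
    }
}

final class ExtensionDemo {
    public static void main(String[] args) {
        System.out.println(new DownstreamInfoServer().servletNames());
    }
}
```

A visibility change of this kind looks harmless inside the upstream code base, which is presumably why an annotation naming the dependent project is useful as a warning.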
The LimitedPrivate aid as I see it is to give folks pause the next time they mess with access/signatures or, if change is necessary, they'll know who to give the heads-up to that change is coming -- should they choose to do so.
St.Ack
-
Re: LimitedPrivate and HBase
Allen Wittenauer 2011-06-06, 20:45
On Jun 6, 2011, at 11:34 AM, Stack wrote:
> On Mon, Jun 6, 2011 at 9:45 AM, Allen Wittenauer <[EMAIL PROTECTED]> wrote:
>> I have some concerns over the recent usage of LimitedPrivate being opened up to HBase. Shouldn't HBase really be sticking to public APIs rather than poking through some holes? If HBase needs an API, wouldn't other clients as well?
>
> HBase uses public APIs. A method we relied on went from protected to
> private. Fixing this brought on the flagging of HttpServer with
> LimitedPrivate (HttpServer, the class flagged, is mostly internal to
> Hadoop but HBase, because, in part, of its subproject provenance,
> extends HttpServer providing its UI reusing Hadoop's log level, thread
> dumping, etc., servlets)
So, no, HBase does not actually use public APIs....
-
Re: LimitedPrivate and HBase
Andrew Purtell 2011-06-06, 23:22
On Mon, 6/6/11, Allen Wittenauer <[EMAIL PROTECTED]> wrote:
> So, no, HBase does not actually use public APIs....
We could instead cut and paste the code in question here from Hadoop and modify it for our purposes. (This was done for RPC, earlier.)
However other Hadoop-ish daemons may well want to extend HttpServer in this same way so the servlets for thread dumps and adjusting logging levels can be reused rather than cloned or reimplemented (or dropped). How HttpServer as implemented configures the container requires one to reach in a bit to set these up plus our own webapps.
Perhaps opening a jira for a cleaner framework for HttpServer extension could be useful?
Best regards,
- Andy
-
Re: LimitedPrivate and HBase
Allen Wittenauer 2011-06-06, 23:34
On Jun 6, 2011, at 4:22 PM, Andrew Purtell wrote:
> Perhaps opening a jira for a cleaner framework for HttpServer extension could be useful?

Sure. That's probably what should have happened to begin with rather than quickly changing the API to a different classification. I was a bit surprised that the JIRA went through so quickly without much comment, but I assumed that's because it was HBase requesting or because so many people were on their way to Berlin.
That said, big picture issue: I still think there should be some official guidance on when LimitedPrivate is the right thing to do. I'd hate to see random groups requesting an API change "just this once" with no real leg to stand on whether it was appropriate or not.
-
Re: LimitedPrivate and HBase
Todd Lipcon 2011-06-07, 00:56
On Mon, Jun 6, 2011 at 4:34 PM, Allen Wittenauer <[EMAIL PROTECTED]> wrote:
> On Jun 6, 2011, at 4:22 PM, Andrew Purtell wrote:
>
> > Perhaps opening a jira for a cleaner framework for HttpServer extension
> > could be useful?
>
> Sure. That's probably what should have happened to begin with
> rather than the quickly changing the API to a different classification. I
> was a bit surprised that the JIRA went through so quickly without much
> comment, but I assumed that's because it was HBase requesting or because so
> many people were on their way to Berlin.

Or because this is the sort of thing that could take weeks of discussion or just 5 minutes to unblock HBase from moving on to trunk. I'd rather have the weeks of discussion *after* the 5 minute patch, so people can continue to make progress. We've moved too slowly for too long.

> That said, big picture issue: I still think there should be some
> official guidance on when LimitedPrivate is the right thing to do. I'd hate
> to see random groups requesting an API change "just this once" with no real
> leg to stand on whether it was appropriate or not.

--
Todd Lipcon
Software Engineer, Cloudera
-
Re: LimitedPrivate and HBase
Allen Wittenauer 2011-06-07, 01:05
On Jun 6, 2011, at 5:56 PM, Todd Lipcon wrote:
> Or because this is the sort of thing that could take weeks of discussion or
> just 5 minutes to unblock HBase from moving on to trunk. I'd rather have the
> weeks of discussion *after* the 5 minute patch, so people can continue to
> make progress. We've moved too slowly for too long.

I didn't realize trunk was coming out as a release next month.
Let's face it: this happened because it was HBase. If it was almost anyone else, it would have sat there.... and *that's* the point where I'm mainly concerned.
-
Re: LimitedPrivate and HBase
Todd Lipcon 2011-06-07, 01:08
On Mon, Jun 6, 2011 at 6:05 PM, Allen Wittenauer <[EMAIL PROTECTED]> wrote:
> On Jun 6, 2011, at 5:56 PM, Todd Lipcon wrote:
>
> > Or because this is the sort of thing that could take weeks of discussion or
> > just 5 minutes to unblock HBase from moving on to trunk. I'd rather have the
> > weeks of discussion *after* the 5 minute patch, so people can continue to
> > make progress. We've moved too slowly for too long.
>
> I didn't realize trunk was coming out as a release next month.

If all goes well, 0.22 will come out as a release some time in that timeframe. Stack has been getting HBase running on it. This patch was to fix 0.22.

> Let's face it: this happened because it was HBase. If it was almost
> anyone else, it would have sat there.... and *that's* the point where I'm
> mainly concerned.

If you want to feel better, take a look at HDFS-941, HDFS-347, and HDFS-918 - these are patches that HBase has been asking for for nearly 2 years in some cases and haven't gone in. Satisfied?

-Todd
--
Todd Lipcon
Software Engineer, Cloudera
-
Re: LimitedPrivate and HBase
Allen Wittenauer 2011-06-07, 01:18
On Jun 6, 2011, at 6:08 PM, Todd Lipcon wrote:
>> Let's face it: this happened because it was HBase. If it was almost
>> anyone else, it would have sat there.... and *that's* the point where I'm
>> mainly concerned.
>
> If you want to feel better, take a look at HDFS-941, HDFS-347, and HDFS-918
> - these are patches that HBase has been asking for for nearly 2 years in
> some cases and haven't gone in. Satisfied?

These cases don't appear to be about re-classification of an API from private to semi-public. So no, I'm not. None of these appear to answer the base set of questions:
- What are the real criteria for changing an API from private to limited?
- How "closely related" does a project need to be to get this privilege?
(Yes, I've read the classification docs. That's too vague.)
I can tell you feel I'm picking on HBase, especially in light of my flat out rejection of the "we want to mmap() blocks" case. But if this reclassification had been with anything else outside of the Hadoop project, I would have asked the same thing. It raises important questions that we as a project need to answer.
-
Re: LimitedPrivate and HBase
Todd Lipcon 2011-06-07, 01:23
On Mon, Jun 6, 2011 at 6:18 PM, Allen Wittenauer <[EMAIL PROTECTED]> wrote:
> These cases don't appear to be about re-classification of an API
> from private to semi-public. So no, I'm not. None of these appear to
> answer the base set of question:

In the specific case of HttpServer, this API *used* to be semi-public, then it was made private as a side effect of another change. (I should know, I'm the one who made the change that accidentally privatized it... it wasn't for any thought-out reason)

> - What is the real criteria for changing an API from private to
> limited?
> - How "closely related" does a project need to be to get this
> privilege?
>
> (Yes, I've read the classification docs. That's too vague.)
>
> I can tell you feel I'm picking on HBase, especially in light of my
> flat out rejection of the "we want to mmap() blocks" case. But if this
> reclassification had been with anything else outside of the Hadoop project,
> I would have asked the same thing. It raises important questions that we as
> a project need to answer.

Nah, I just think these "meta discussions" waste an awful lot of time that's better spent making real progress on the code, or reviewing the complex changes where extra eyes really make a big difference.

http://www.bikeshed.com/

-Todd
--
Todd Lipcon
Software Engineer, Cloudera
-
Re: LimitedPrivate and HBase
Allen Wittenauer 2011-06-07, 01:33
On Jun 6, 2011, at 6:23 PM, Todd Lipcon wrote:
> Nah, I just think these "meta discussions" waste an awful lot of time that's
> better spent making real progress on the code, or reviewing the complex
> changes where extra eyes really make a big difference.

OK. That'll make it easier to just -1 changes like this with reasoning such as "HBase is not a related project." Then we can go back to working on core Hadoop.
-
Re: LimitedPrivate and HBase
Andrew Purtell 2011-06-08, 16:17
> From: Allen Wittenauer <[EMAIL PROTECTED]>
> OK. That's make it easier to just
> -1 changes like this with reasoning such as "HBase is not a
> related project." Then we can go back working on core
> Hadoop.
Seriously?
Forget what I said about filing a JIRA (and working on it) to give HttpServer an extensibility that possibly would pass muster with you.
- Andy
-
Re: LimitedPrivate and HBase
Todd Lipcon 2011-06-08, 16:39
On Wed, Jun 8, 2011 at 9:17 AM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
> > From: Allen Wittenauer <[EMAIL PROTECTED]>
> > OK. That's make it easier to just
> > -1 changes like this with reasoning such as "HBase is not a
> > related project." Then we can go back working on core
> > Hadoop.
>
> Seriously?
>
> Forget what I said about filing a JIRA (and working on it) to give
> HttpServer an extensibility that possibly would past muster with you.
Please know that Allen's opinions are not representative of the whole community, and -1s without technical basis can be overridden.
I'll leave it at that to try to short circuit a potential flame war.
-Todd
--
Todd Lipcon
Software Engineer, Cloudera
-
Re: LimitedPrivate and HBase
Eric Baldeschwieler 2011-06-09, 16:28
I'd like to see a proposal circulated to handle this concern in a maintainable way.
On Jun 8, 2011, at 9:17 AM, Andrew Purtell wrote:
>> From: Allen Wittenauer <[EMAIL PROTECTED]>
>> OK. That's make it easier to just
>> -1 changes like this with reasoning such as "HBase is not a
>> related project." Then we can go back working on core
>> Hadoop.
>
> Seriously?
>
> Forget what I said about filing a JIRA (and working on it) to give HttpServer an extensibility that possibly would past muster with you.
>
> - Andy
-
Re: LimitedPrivate and HBase
Tom White 2011-06-09, 17:27
Looking at current usage in Hadoop, there are only 4 LimitedPrivate references to HBase (the http, io.retry, ipc, and metrics packages in Common), and 2 references to Pig (the two LineRecordReader classes in MapReduce). The other LimitedPrivate references are all to HDFS or MapReduce. Given that Private means "Intended for use only within Hadoop itself" (according to the javadoc), we can replace these references with Private.
We could also change the remaining 6 cases of LimitedPrivate to Public (note that they are already annotated Evolving or Unstable), and deprecate LimitedPrivate. Would this allay people's concerns?
Cheers, Tom
On Thu, Jun 9, 2011 at 9:28 AM, Eric Baldeschwieler <[EMAIL PROTECTED]> wrote:
> I'd like to see a proposal circulated to handle this concern in a maintainable way.
>
> On Jun 8, 2011, at 9:17 AM, Andrew Purtell wrote:
>
>>> From: Allen Wittenauer <[EMAIL PROTECTED]>
>>> OK. That's make it easier to just
>>> -1 changes like this with reasoning such as "HBase is not a
>>> related project." Then we can go back working on core
>>> Hadoop.
>>
>> Seriously?
>>
>> Forget what I said about filing a JIRA (and working on it) to give HttpServer an extensibility that possibly would past muster with you.
>>
>> - Andy
-
Re: LimitedPrivate and HBase
Konstantin Boudnik 2011-06-09, 17:38
+1 on this. I think LimitedPrivate is somewhat moot. It seems better to just get rid of it. $0.02
Cos
On Thu, Jun 09, 2011 at 10:27AM, Tom White wrote:
> Looking at current usage in Hadoop, there are only 4 LimitedPrivate
> references to HBase (the http, io.retry, ipc, and metrics packages in
> Common), and 2 references to Pig (the two LineRecordReader classes in
> MapReduce). The other LimitedPrivate references are all to HDFS or
> MapReduce. Given that Private means "Intended for use only within
> Hadoop itself" (according to the javadoc), we can replace these
> references with Private.
>
> We could also change the remaining 6 cases of LimitedPrivate to Public
> (note that they are already annotated Evolving or Unstable), and
> deprecate LimitedPrivate. Would this allay people's concerns?
>
> Cheers,
> Tom
>
> On Thu, Jun 9, 2011 at 9:28 AM, Eric Baldeschwieler
> <[EMAIL PROTECTED]> wrote:
> > I'd like to see a proposal circulated to handle this concern in a maintainable way.
> >
> > On Jun 8, 2011, at 9:17 AM, Andrew Purtell wrote:
> >
> >>> From: Allen Wittenauer <[EMAIL PROTECTED]>
> >>> OK. That's make it easier to just
> >>> -1 changes like this with reasoning such as "HBase is not a
> >>> related project." Then we can go back working on core
> >>> Hadoop.
> >>
> >> Seriously?
> >>
> >> Forget what I said about filing a JIRA (and working on it) to give HttpServer an extensibility that possibly would past muster with you.
> >>
> >> - Andy
-
Re: LimitedPrivate and HBase
Suresh Srinivas 2011-06-09, 17:56
Javadoc did not capture the intent well. Please see HADOOP-5073. We should fix the Javadoc to avoid confusion.
Even though there are only a few instances of LimitedPrivate, I prefer retaining it. I prefer to keep MiniDFSCluster LimitedPrivate and not support it as a public interface.

On 6/9/11 10:27 AM, "Tom White" <[EMAIL PROTECTED]> wrote:
> Looking at current usage in Hadoop, there are only 4 LimitedPrivate
> references to HBase (the http, io.retry, ipc, and metrics packages in
> Common), and 2 references to Pig (the two LineRecordReader classes in
> MapReduce). The other LimitedPrivate references are all to HDFS or
> MapReduce. Given that Private means "Intended for use only within
> Hadoop itself" (according to the javadoc), we can replace these
> references with Private.
>
> We could also change the remaining 6 cases of LimitedPrivate to Public
> (note that they are already annotated Evolving or Unstable), and
> deprecate LimitedPrivate. Would this allay people's concerns?
>
> Cheers,
> Tom
>
> On Thu, Jun 9, 2011 at 9:28 AM, Eric Baldeschwieler
> <[EMAIL PROTECTED]> wrote:
>> I'd like to see a proposal circulated to handle this concern in a
>> maintainable way.
>>
>> On Jun 8, 2011, at 9:17 AM, Andrew Purtell wrote:
>>
>>>> From: Allen Wittenauer <[EMAIL PROTECTED]>
>>>> OK. That's make it easier to just
>>>> -1 changes like this with reasoning such as "HBase is not a
>>>> related project." Then we can go back working on core
>>>> Hadoop.
>>>
>>> Seriously?
>>>
>>> Forget what I said about filing a JIRA (and working on it) to give
>>> HttpServer an extensibility that possibly would past muster with you.
>>>
>>> - Andy
-
Re: LimitedPrivate and HBase
Konstantin Boudnik 2011-06-09, 18:02
On Thu, Jun 09, 2011 at 10:56AM, Suresh Srinivas wrote:
> Javadoc did not capture the intent well. Please see HADOOP-5073. We should
> fix the Javadoc to avoid confusion.
>
> Even though there are only few instances of LimitedPrivate, I prefer
> retaining. I prefer to keep MiniDFSCluster LimitedPrivate and not support as
> as public interface.
Suresh, but MiniDFSCluster is a test facility with a somewhat limited functionality. It isn't that important really if it has LimitedPrivate or Private annotation.
Cos
> On 6/9/11 10:27 AM, "Tom White" <[EMAIL PROTECTED]> wrote:
>
> > Looking at current usage in Hadoop, there are only 4 LimitedPrivate
> > references to HBase (the http, io.retry, ipc, and metrics packages in
> > Common), and 2 references to Pig (the two LineRecordReader classes in
> > MapReduce). The other LimitedPrivate references are all to HDFS or
> > MapReduce. Given that Private means "Intended for use only within
> > Hadoop itself" (according to the javadoc), we can replace these
> > references with Private.
> >
> > We could also change the remaining 6 cases of LimitedPrivate to Public
> > (note that they are already annotated Evolving or Unstable), and
> > deprecate LimitedPrivate. Would this allay people's concerns?
> >
> > Cheers,
> > Tom
> >
> > On Thu, Jun 9, 2011 at 9:28 AM, Eric Baldeschwieler
> > <[EMAIL PROTECTED]> wrote:
> >> I'd like to see a proposal circulated to handle this concern in a
> >> maintainable way.
> >>
> >> On Jun 8, 2011, at 9:17 AM, Andrew Purtell wrote:
> >>
> >>>> From: Allen Wittenauer <[EMAIL PROTECTED]>
> >>>> OK. That's make it easier to just
> >>>> -1 changes like this with reasoning such as "HBase is not a
> >>>> related project." Then we can go back working on core
> >>>> Hadoop.
> >>>
> >>> Seriously?
> >>>
> >>> Forget what I said about filing a JIRA (and working on it) to give
> >>> HttpServer an extensibility that possibly would past muster with you.
> >>>
> >>> - Andy
-
Re: LimitedPrivate and HBase
Allen Wittenauer 2011-06-13, 16:51
On Jun 9, 2011, at 10:27 AM, Tom White wrote:
> We could also change the remaining 6 cases of LimitedPrivate to Public
> (note that they are already annotated Evolving or Unstable), and
> deprecate LimitedPrivate. Would this allay people's concerns?

Thanks for doing the search Tom.
Rather than just flip them all to public, I'd like to propose the following:
a) LimitedPrivate APIs to non-Hadoop projects are allowed to exist for 1 minor release.
b) When a LimitedPrivate to non-Hadoop project is created, a blocker JIRA for the next release is created.
c) That JIRA should be used to determine whether the API should go private or public, and if the latter, what interface changes are required.
This gives us a roadmap to make determinations on whether the LimitedPrivate API was a success or not and gives us a timeline as to when they go away, rather than just hanging around forever in a limbo state.
I suspect that most of these will go public anyway, but it would be good to have some documentation on the how's, why's, etc. This could also be used to allay the fears of a "bad" public interface.
-
Re: LimitedPrivate and HBase
Konstantin Boudnik 2011-06-13, 17:10
On Mon, Jun 13, 2011 at 09:51AM, Allen Wittenauer wrote:
> On Jun 9, 2011, at 10:27 AM, Tom White wrote:
> >
> > We could also change the remaining 6 cases of LimitedPrivate to Public
> > (note that they are already annotated Evolving or Unstable), and
> > deprecate LimitedPrivate. Would this allay people's concerns?
>
> Thanks for doing the search Tom.
>
> Rather than just flip them all to public, I'd like to propose the following:
>
> a) LimitedPrivate APIs to non-Hadoop projects are allowed to exist for 1 minor release.
> b) When a LimitedPrivate to non-Hadoop project is created, a blocker JIRA for the next release is created.
> c) That JIRA should be used to determine whether the API should go private
> or public, and if the latter, what interface changes are required.

Allen, the idea looks reasonable yet over-complicated for the dev process. It seems better to stick with Tom's original proposal, i.e. eliminate LimitedPrivate and use just Private instead: the fewer interfaces to track and decisions to make, the more transparent the process will be, IMO.

> This gives us a roadmap to make determinations on whether the LimitedPrivate
> API was a success or not and gives us a timeline as to when they go away,
> rather than just hanging around forever in a limbo state.
>
> I suspect that most of these will go public anyway, but it would be good to
> have some documentation on the how's, why's, etc. This could also be used
> to allay the fears of a "bad" public interface.
-
Re: LimitedPrivate and HBase
Sanjay Radia 2011-06-14, 19:58
On Jun 9, 2011, at 10:27 AM, Tom White wrote:
> Looking at current usage in Hadoop, there are only 4 LimitedPrivate
> references to HBase (the http, io.retry, ipc, and metrics packages in
> Common), and 2 references to Pig (the two LineRecordReader classes in
> MapReduce). The other LimitedPrivate references are all to HDFS or
> MapReduce. Given that Private means "Intended for use only within
> Hadoop itself" (according to the javadoc), we can replace these
> references with Private.

Okay so that was incorrectly stated in the Javadoc - if you read

> We could also change the remaining 6 cases of LimitedPrivate to Public
> (note that they are already annotated Evolving or Unstable), and
> deprecate LimitedPrivate. Would this allay people's concerns?

-1
I disagree with the proposed changes.

Most folks are missing the point of limited private. The Jira (HADOOP-5073) that created this classification gives a very detailed explanation of the motivation and purpose of the classification. Unfortunately most of the explanation in the Jira was not copied to the Javadoc. My mistake here. I will file a jira to copy the classification documentation from the Jira to a Javadoc. BTW the javadoc is incorrect:

> Private means "Intended for use only within
> Hadoop itself" (according to the javadoc)

The definition in the Jira (HADOOP-5073) explains that private means project private. It is private to HDFS or private to MR etc. Private does not mean private to Hadoop - otherwise MR can use any internal private class inside HDFS. We don't want that. When we did the actual annotation tags, the term project-private was simplified to private since folks felt it was too verbose.
The Jira states: >>> project-private the interface is for internal use within the project and should not be used by applications. It is subject to change at anytime without notice. Most interfaces of a project are project private. <<<<
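To make the scoping question concrete, here is a small sketch of the access rule under the project-private reading described above. It is only one interpretation of the thread, not code from Hadoop: `AudienceRules`, `Level`, and `mayUse` are hypothetical names, and projects are represented as plain strings.

```java
import java.util.List;

// Hypothetical model of the audience rules as read from this thread.
final class AudienceRules {
    enum Level { PUBLIC, LIMITED_PRIVATE, PRIVATE }

    private AudienceRules() {}

    /**
     * Decides whether code in the caller project may depend on an API owned
     * by the owner project, assuming Private means private to the owning
     * subproject rather than to Hadoop as a whole.
     */
    static boolean mayUse(String caller, String owner, Level level, List<String> limitedTo) {
        switch (level) {
            case PUBLIC:
                // Anyone may use it.
                return true;
            case LIMITED_PRIVATE:
                // The owner plus the explicitly named projects.
                return caller.equals(owner) || limitedTo.contains(caller);
            case PRIVATE:
                // Only the owning subproject itself.
                return caller.equals(owner);
            default:
                return false;
        }
    }
}
```

Under this reading, MapReduce calling a Private HDFS class is rejected. Under the alternative reading, where Private is scoped to all of Hadoop, the `PRIVATE` case would instead have to accept any Hadoop subproject as the caller.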
My mistake for not checking the actual javadoc carefully when Jacob's annotations patch was committed. I will file a jira to copy the document from the Jira to the Javadoc.
I will post a longer email explaining my position and my -1 more clearly after I have had a chance to read all the emails carefully.

sanjay
-
Re: LimitedPrivate and HBase
Sanjay Radia 2011-06-14, 21:20
On Jun 14, 2011, at 12:58 PM, Sanjay Radia wrote:
> -1
> I disagree with the proposed changes.
> .....
> I will post a longer email explaining my position and my -1 more
> clearly after I have had a chance to read all the emails carefully.
>
> sanjay

Please don't take my -1 too strongly. It was NOT meant to be offensive. I saw a lot of +1s and wanted to make sure that this doesn't turn into a jira and a commit in a few days.
sanjay
-
Re: LimitedPrivate and HBase
Tom White 2011-06-15, 18:21
No offense taken.
Sanjay and I chatted offline. We disagree on whether Private should be scoped to the whole project (Hadoop), or to subprojects (Common, HDFS, MapReduce). The intent of HADOOP-5073 was for the latter, but I'm not convinced it buys us anything really, which is why I've been arguing for the former. However, what we have at the moment in the javadoc is misleading, so that at least needs clearing up via HADOOP-7391. Thanks for doing this Sanjay.
I do think (and I think Sanjay agrees with me here) we should move away from LimitedPrivate access for external projects by creating public APIs that we are prepared to support (note again that they may be Evolving or even Unstable) - either by marking existing (Limited)Private APIs as public, or by creating a new API. We should do this for the 6 cases I highlighted earlier in the thread.
Cheers, Tom
On Tue, Jun 14, 2011 at 2:20 PM, Sanjay Radia <[EMAIL PROTECTED]> wrote:
> On Jun 14, 2011, at 12:58 PM, Sanjay Radia wrote:
>
>> -1
>> I disagree with the proposed changes.
>> .....
>> I will post a longer email explaining my position and my -1 more
>> clearly after I have had a chance to read all the emails carefully.
>>
>> sanjay
>
> Please don't take my -1 too strongly.
> It was NOT meant to be offensive. I saw a lot of +1s and wanted to make sure
> that this doesn't turn into a jira and a commit in few days.
>
> sanjay
-
Re: LimitedPrivate and HBase
Sanjay Radia 2011-06-16, 15:44
I have updated https://issues.apache.org/jira/browse/HADOOP-7391 with an html file that is for adding as an overview.html to the classification package. The text is mostly what was in HADOOP-5073; I have added 3 FAQs.
-
Re: LimitedPrivate and HBase
Ian Holsman 2011-06-07, 01:50
On Jun 6, 2011, at 9:18 PM, Allen Wittenauer wrote:
> But if this reclassification had been with anything else outside of the Hadoop project, I would have asked the same thing. It raises important questions that we as a project need to answer.

"Life's not fair". :-) (couldn't resist.. sorry)
But in all seriousness, the project should treat every request on its own merits, regardless of which project/person requests it. In actuality if you've been working closely with another project or know the person who is asking for it, you'll naturally put more weight on the request, as you have more trust that that person knows what they are doing, and aren't being lazy/poorly informed about requesting it.
I was wondering if there are other examples of interfaces you (or others) think should be un-deprecated/made public.
-
Re: LimitedPrivate and HBase
Andrew Purtell 2011-06-08, 16:13
> I can tell you feel I'm picking on HBase, especially in light of my
> flat out rejection of the "we want to mmap() blocks" case.
I for one understand the objection there.
Although it does negatively impact the work of a recent promising new contributor. As a project, HBase suffers for it. Of course that is no concern of HDFS.
On the other hand I do believe Todd has a point. MapReduce is perhaps the only constituency that HDFS really cares about. Any reasonable person would come to that conclusion after surveying submitted JIRAs and their resolution times (or not). Historically with HDFS the local itch, the concern of the big MapReduce shops, gets the scratch and others are of not much concern. Therefore there is unfortunate business that lingers today -- Facebook, StumbleUpon, Trend Micro, and others have effectively forked HDFS (0.20) in house for use with HBase, and nobody I know is seriously considering using HDFS 0.22 or TRUNK due to a lack of evidence that anyone with a stake in it is running it in production at scale. Past discussion to mend the breach with an HBase-friendly release of HDFS 0.20 ended with what I would describe as an inflexible and legalistic air.
Best regards,
- Andy
Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)

--- On Mon, 6/6/11, Allen Wittenauer <[EMAIL PROTECTED]> wrote:
> From: Allen Wittenauer <[EMAIL PROTECTED]>
> Subject: Re: LimitedPrivate and HBase
> To: [EMAIL PROTECTED]
> Date: Monday, June 6, 2011, 6:18 PM
>
> On Jun 6, 2011, at 6:08 PM, Todd Lipcon wrote:
> >> Let's face it: this happened because it was HBase. If it was almost
> >> anyone else, it would have sat there.... and *that's* the point where I'm
> >> mainly concerned.
> >
> > If you want to feel better, take a look at HDFS-941, HDFS-347, and HDFS-918
> > - these are patches that HBase has been asking for for nearly 2 years in
> > some cases and haven't gone in. Satisfied?
>
> These cases don't appear to be about re-classification of an API from
> private to semi-public. So no, I'm not. None of these appear to answer
> the base set of question:
>
> - What is the real criteria for changing an API from private to limited?
> - How "closely related" does a project need to be to get this privilege?
>
> (Yes, I've read the classification docs. That's too vague.)
>
> I can tell you feel I'm picking on HBase, especially in light of my flat
> out rejection of the "we want to mmap() blocks" case. But if this
> reclassification had been with anything else outside of the Hadoop
> project, I would have asked the same thing. It raises important
> questions that we as a project need to answer.
-
Re: LimitedPrivate and HBase
Steve Loughran 2011-06-08, 16:36
On 06/08/2011 05:13 PM, Andrew Purtell wrote:
>> I can tell you feel I'm picking on HBase, especially in light of my flat out rejection of the "we want to mmap() blocks" case.
>
> I for one understand the objection there.
>
> Although it does negatively impact the work of a recent promising new contributor. As a project, HBase suffers for it. Of course that is no concern of HDFS.
>
> On the other hand I do believe Todd has a point. MapReduce is perhaps the only constituency that HDFS really cares about. Any reasonable person would come to that conclusion after surveying submitted JIRAs and their resolution times (or not). Historically with HDFS the local itch, the concern of the big MapReduce shops, gets the scratch and others are of not much concern. Therefore there is unfortunate business that lingers today -- Facebook, StumbleUpon, Trend Micro, and others have effectively forked HDFS (0.20) in house for use with HBase, and nobody I know is seriously considering using HDFS 0.22 or TRUNK due to a lack of evidence that anyone with a stake in it is running it in production at scale. Past discussion to mend the breach with an HBase-friendly release of HDFS 0.20 ended with what I would describe as an inflexible and legalistic air.
Well, today MR is the primary constituency, but to be a stack you do have to make the other layers work: MR, with Hive and Pig on top, HBase, Mahout.
These extra layers can form part of the regression tests for the underlying code: if a change breaks HBase or Hive, that's something to catch early, and say "this change to hadoop-common broke it".
Yes, it's extra hassle dealing with changes that break things, but you find the problems so end users don't have to. And Jenkins can be set up to do much of the work: you just tweak the dependencies of the downstream projects to use the svn.trunk or -SNAPSHOT version of your code, run the builds in the right order to generate the artifacts, and wait for the emails to come in.
-steve
-
Re: LimitedPrivate and HBase
Stack 2011-06-09, 18:30
On Mon, Jun 6, 2011 at 4:34 PM, Allen Wittenauer <[EMAIL PROTECTED]> wrote:
> On Jun 6, 2011, at 4:22 PM, Andrew Purtell wrote:
>>
>> Perhaps opening a jira for a cleaner framework for HttpServer extension could be useful?
>
> Sure. That's probably what should have happened to begin with rather than the quickly changing the API to a different classification.
It was done in HADOOP-2024 back in 2007.
St.Ack
-
Re: LimitedPrivate and HBase
Stack 2011-06-09, 18:20
On Mon, Jun 6, 2011 at 1:45 PM, Allen Wittenauer <[EMAIL PROTECTED]> wrote:
>
> On Jun 6, 2011, at 11:34 AM, Stack wrote:
>
>> On Mon, Jun 6, 2011 at 9:45 AM, Allen Wittenauer <[EMAIL PROTECTED]> wrote:
>>>
>>> I have some concerns over the recent usage of LimitedPrivate being opened up to HBase. Shouldn't HBase really be sticking to public APIs rather than poking through some holes? If HBase needs an API, wouldn't other clients as well?
>>>
>>
>> HBase uses public APIs. A method we relied on went from protected to private. Fixing this brought on the flagging of HttpServer with LimitedPrivate (HttpServer, the class flagged, is mostly internal to Hadoop but HBase, because, in part, of its subproject provenance, extends HttpServer providing its UI reusing Hadoop's log level, thread dumping, etc., servlets)
>
> So, no, HBase does not actually use public APIs....
>
The move from protected to private was an error rectified by HADOOP-7351.
St.Ack
-
Re: LimitedPrivate and HBase
Steve Loughran 2011-06-07, 09:09
On 06/06/2011 05:45 PM, Allen Wittenauer wrote:
>
> I have some concerns over the recent usage of LimitedPrivate being opened up to HBase. Shouldn't HBase really be sticking to public APIs rather than poking through some holes? If HBase needs an API, wouldn't other clients as well?
>
Hey, isn't HBase part of the official Hadoop stack?
-
Re: LimitedPrivate and HBase
Ted Dunning 2011-06-07, 10:10
Not really. It isn't part of any Hadoop release. And no official Hadoop release will run HBase reliably.
On Tue, Jun 7, 2011 at 2:09 AM, Steve Loughran <[EMAIL PROTECTED]> wrote:
> On 06/06/2011 05:45 PM, Allen Wittenauer wrote:
>
>> I have some concerns over the recent usage of LimitedPrivate being opened up to HBase. Shouldn't HBase really be sticking to public APIs rather than poking through some holes? If HBase needs an API, wouldn't other clients as well?
>>
> Hey, isn't HBase part of the official Hadoop stack?
>