Sorting user defined MR counters.
Niels Basjes 2012-12-29, 23:32
Hi,
I've had this 'itch' with Hadoop that it is hard to sort the counters in a "nice" way. Now the current trunk sorts the framework counters in such a way that they follow the flow quite nicely. For the generic counters (i.e. user code counters) this is not possible.
I've been playing around these last few days to see if I can extend the API so I can create custom counters from my MR code and have the framework report them in the sorting order I defined that is useful for my specific application.
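To make the idea concrete, here is a minimal sketch in plain Java. It is only an illustration, not the patch mentioned in this thread: the class `CounterOrderSketch`, the enum `ParseCounter`, the sample values and both helper methods are hypothetical names, and no Hadoop API is used. It assumes the "defined" order is simply the order in which the counters are declared, and uses alphabetical order as a stand-in for a report order that ignores the job's flow.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: none of these names come from Hadoop or from the patch.
final class CounterOrderSketch {

    // User counters, declared in the order the job's flow reaches them.
    public enum ParseCounter { LINES_READ, LINES_SKIPPED, RECORDS_PARSED, RECORDS_EMITTED }

    // Sample values, inserted in scrambled order so that sorting has a visible effect.
    public static Map<ParseCounter, Long> sample() {
        Map<ParseCounter, Long> counters = new LinkedHashMap<>();
        counters.put(ParseCounter.RECORDS_EMITTED, 90L);
        counters.put(ParseCounter.LINES_READ, 100L);
        counters.put(ParseCounter.RECORDS_PARSED, 95L);
        counters.put(ParseCounter.LINES_SKIPPED, 5L);
        return counters;
    }

    // Alphabetical by name: a stand-in for an order that ignores the job's flow.
    public static List<String> byName(Map<ParseCounter, Long> counters) {
        List<String> names = new ArrayList<>();
        for (ParseCounter c : counters.keySet()) {
            names.add(c.name());
        }
        names.sort(Comparator.naturalOrder());
        return names;
    }

    // Declaration order (enum ordinal): the order the author of the job defined.
    public static List<String> byDeclaration(Map<ParseCounter, Long> counters) {
        List<ParseCounter> keys = new ArrayList<>(counters.keySet());
        keys.sort(Comparator.comparingInt(ParseCounter::ordinal));
        List<String> names = new ArrayList<>();
        for (ParseCounter c : keys) {
            names.add(c.name());
        }
        return names;
    }

    public static void main(String[] args) {
        Map<ParseCounter, Long> counters = sample();
        System.out.println("by name:        " + byName(counters));
        System.out.println("by declaration: " + byDeclaration(counters));
    }
}
```

In this sketch the two orders differ only in where `RECORDS_EMITTED` lands: alphabetically it sorts before `RECORDS_PARSED`, although in the flow of the job it comes after.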
I currently have a working version of this idea here so I'm wondering ...
Is this something you would like to have in the main source tree? From my point of view this is very generic and reusable by many projects.
If you say 'yes' then I'll simply create a Jira for this and submit the patch.
In addition I have a question about the content of such a patch. In some of the source files I'll be changing there are numerous very basic warnings from the Java compiler, FindBugs and Checkstyle. By basic I mean silly things like "unused imports", "an interface that has directives like private and public", "first line of javadoc must end with '.'" and "unused SuppressWarnings directives".
In my normal job I commit a source file at least "just as clean" as what I got to start with, but preferably I make it "cleaner". That way a warning is something you look at instead of "part of the landslide" which people tend to ignore.
Now for submitting changes for Hadoop: Is it desirable that I fix these in my change set or should I leave these as-is to avoid "obfuscating" the changes that are relevant to the Jira at hand?
-- Best regards / Met vriendelijke groeten,
Niels Basjes
Re: Sorting user defined MR counters.
Steve Loughran 2013-01-02, 10:35
On 29 December 2012 23:32, Niels Basjes <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I've had this 'itch' with Hadoop that it is hard to sort the counters in a
> "nice" way.
> Now the current trunk sorts the framework counters in such a way that they
> follow the flow quite nicely. For the generic counters (i.e. user code
> counters) this is not possible.
>
> I've been playing around these last few days to see if I can extend the API
> so I can create custom counters from my MR code and have the framework
> report them in the sorting order I defined that is useful for my specific
> application.
>
> I currently have a working version of this idea here so I'm wondering ...
>
> Is this something you would like to have in the main source tree? From my
> point of view this is very generic and reusable by many projects.
>
> If you say 'yes' then I'll simply create a Jira for this and submit the
> patch.
>
> In addition I have a question about the content of such a patch.
> In some source files I'll be changing there are numerous very basic
> warnings from both the Java compiler, findbugs and checkstyle.
> With simple things I mean silly things "unused imports", "an interface that
> has directives like private and public", "first line of javadoc must end
> with '.'", "unused SuppressWarnings directives" .
>

yes, the Java codebase is a bit messy. It's the price of review-then-commit; it adds overhead to do cleanups as you go along, although doing a code cleanup is something that many people would appreciate.

> I my normal job I commit a source file at least "just as clean" as what I
> got to start with but preferably I make it "cleaner". That way a warning is
> something you look at instead of "part of the landslide" which people tend
> to ignore.
>
> Now for submitting changes for Hadoop: Is it desirable that I fix these in
> my change set or should I leave these as-is to avoid "obfuscating" the
> changes that are relevant to the Jira at hand?
>
I recommend doing a cleanup first; that's likely to go in without any argument. Your patch with the new features would then be a diff against the cleaned-up code, so there would be fewer changes to review.
Re: Sorting user defined MR counters.
Niels Basjes 2013-01-07, 15:57
Hi Steve,
> > Now for submitting changes for Hadoop: Is it desirable that I fix these in
> > my change set or should I leave these as-is to avoid "obfuscating" the
> > changes that are relevant to the Jira at hand?
> >
>
> I recommend a cleanup first -that's likely to go in without any argument.
> Your patch with the new features would be a diff against the clean, so have
> less changes to be reviewed.
>

Ok, I'll have a look at what I can do.
Should I focus on fixing problems across the entire code base, or restrict my changes to a limited set of subprojects (i.e. only the mapreduce ones)?
-- Best regards / Met vriendelijke groeten,
Niels Basjes
Re: Sorting user defined MR counters.
Steve Loughran 2013-01-08, 10:48
On 7 January 2013 15:57, Niels Basjes <[EMAIL PROTECTED]> wrote:
> Hi Steve,
>
> > > Now for submitting changes for Hadoop: Is it desirable that I fix these in
> > > my change set or should I leave these as-is to avoid "obfuscating" the
> > > changes that are relevant to the Jira at hand?
> > >
> >
> > I recommend a cleanup first -that's likely to go in without any argument.
> > Your patch with the new features would be a diff against the clean, so have
> > less changes to be reviewed.
> >
>
> Ok, I'll have a look what I can do.
> Should I focus on fixing problems within the entire code base or limit my
> changes to a limited set of subprojects (i.e. only the mapreduce ones) ?
>
> --
>

I'd pick something self contained -cross-project changes are much harder to get reviewed and in-