Below is a defense of my applying ACCUMULO-1395 to the 1.6.0-SNAPSHOT branch. Since it was reverted, my hope here is to successfully argue the case that it should be included:
The bin/bootstrap_config.sh script does not result in files that are appropriate for use on Hadoop 1. It requires manual modification to work. There are other reasons a user might need to modify these files before running, such as changing the default instance secret. However, bin/accumulo init warns users to edit that. It does not warn if you're using Hadoop 1. This is essentially a bug, because it's not obvious to users that the configuration generated will fail to work, and the error messages in 'bin/accumulo init' are not very informative as to how to fix the problem.
This makes it quite annoying for testing on Hadoop 1, as every new build requires manually editing config files to work. ACCUMULO-1395 solves this by adding an extra prompt to bootstrap_config.sh and generating config files which work right away.
However, that fix has been determined to be inappropriate for 1.6.0. Perhaps a smaller, more targeted fix, to add a prompt to the bootstrap_config.sh file, to alter the config files appropriately for Hadoop 1 should be done for 1.6.0? However, this fix is available now, and the only thing it might need, is a slight edit of the README to note that the bootstrap_config.sh prompts for Hadoop 1 or Hadoop 2, to generate examples.
While this change looks like a large change, it really isn't. Most of the changes are removing redundant configuration examples in favor of a single example, used as a template. Removing multiple copies of almost identical files in favor of a single set makes the patch look artificially large. It's actually a quite minor change. While the intention of ACCUMULO-1395 was ease of maintenance, it turns out it actually eases testing and user experience, by fixing the bugs in the previously produced config files from bootstrap_config.sh for Hadoop 1, which required manual intervention. The benefits for maintenance from the consolidation of the examples is secondary to this fix, at this point.
The common objections, and justification for reverting, make sense, but I think they are unwarranted in this case, because, while the ticket is labeled "Improvement", it actually fixes a bug. It also doesn't actually make improvements or add features to Accumulo itself... just examples and packaging improvements, which we've previously identified as acceptable to change in preparation for release, after the feature freeze. Further, it has been up on ReviewBoard for weeks, and John has responded to all the feedback. It has been thoroughly tested, works great, and has been ready for weeks. I know there may still be improvements or changes that could be done, but this fixes problems now, and eases testing on Hadoop 1.
If the objections stand, in favor of reverting and delaying to 1.7.0, perhaps a more targeted fix to bootstrap_config.sh would be acceptable, adding a prompt for Hadoop 1/Hadoop 2, which does not consolidate the examples? This could be done with additional example copies (easy), or some automated in-place editing after copying (not too much harder).
I too would be fine with this if the examples remained as-is for the 1.6.0 release.
I hadn't seen the review previously, because frankly I've been focused on things that impact 1.6.0. Since it's in submitted status I'll file follow on tickets for issues I have with the patch outside of the immediate matter of inclusion in 1.6.0.
-Sean On Sat, Apr 5, 2014 at 7:32 AM, Josh Elser <[EMAIL PROTECTED]> wrote:
Personally, I think the template serves as a sufficient example, and the generated files after executing the bootstrap_config script should also serve the same purpose, but I can appreciate the lower impact (especially to documentation that may refer to examples) of leaving the examples in place.
Other options for follow-on changes (for 1.7.0 and later) include: put examples in the docs folder instead of config, use the bootstrap_config.sh to generate the examples in the build, add header to the generated files to denote identify the options with which they were generated.
You have to think about the impact that you put on people downstream that are expecting those files to be in place. Packagers, most notably, would be affected. It's not just that there is an example that someone can interpret, but also automated processes or tutorials/howtos that people have written that expect these files to exist.
I do want to hear from Mike Drob and John Vines first before I take any further action on this, though. If they're okay with what Sean and Josh suggested (making the changes to 1.6.0 without removing the examples), I'll do that. If not, I'll re-apply the commit to 1.7.0 only.
Either way, I'll put in a follow-on ticket for auto-generating the example configs in the build and moving them from conf/ to docs/ in 1.7.0. I'll still wait for their feedback first, though, so I get the initial wording of those tickets right.
If there is a way to mark the changes as experimental in 1.6.0 I would be most happy with that, otherwise just applying them and leaving the examples is fine. Removing the examples from 1.7 is the way to go, in my opinion.
My main issue was that this change would be very surprising for downstream consumers who are trying to do deployments, and we haven't had very much time to test it. On Sat, Apr 5, 2014 at 9:00 PM, John Vines <[EMAIL PROTECTED]> wrote:
Okay, so it sounds like all those who objected to this for 1.6.0 are okay with the compromise of leaving the examples in place for 1.6.0. So, I'll proceed with that. Thanks, all!
(Mike, I'm not sure it how to mark it as experimental, or that it is experimental in the first place, but from what you've said, and clarification in IRC, it sounds like you're not going to be too hung up on that.)