Had some discussion w/ Dave Marion about the need to drop relative paths from internal metadata. From a user standpoint the requirement to possibly configure instance.dfs.uri and instance.dfs.dir if they might have relative paths is confusing over the long term. Also it places more of a maintenance burden on us if we need to ensure all bug fixes and new features work properly w/ relative paths.
What are our options and what should the timeline be? We could require the user to do something to remove all relative paths before starting 1.7.0, for example.
Some of the things we discussed:
* Provide a utility to rewrite all relative paths
* Rework the volume replacement code to work w/ relative paths
A stand-alone utility is tricky. Don't want to modify tablet metadata if the table is loaded. That's why the volume replacement code has the tablets themselves do the replacement.
I like the idea of reworking the volume replacement code, but I do not like the idea of it happening automatically (like the first time 1.6.2 is started). Could possibly have a boolean config instance.volume.replaceRelative. When this is set, as tablets are loaded and when the GC starts relative paths would be replaced using current instance.dfs.* config or hdfs config.
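To make the instance.volume.replaceRelative idea concrete, here is a minimal Python sketch of what replacing one metadata entry might look like. This is not Accumulo's code: the function name, the assumption that relative entries resolve under <instance.dfs.dir>/tables/<table id>, and the sample URI are all illustrative assumptions.

```python
import posixpath


def replace_relative(path, table_id, dfs_uri, dfs_dir, replace_enabled=True):
    """Return an absolute URI for a metadata file entry.

    Hypothetical helper. Entries that already carry a scheme pass through
    unchanged, and nothing is rewritten unless the flag is set.
    """
    if not replace_enabled or "://" in path:
        return path
    # Assumption: relative entries are resolved against the table's directory,
    # so "../<other id>/..." entries land in another table's directory.
    base = posixpath.join(dfs_dir, "tables", table_id)
    resolved = posixpath.normpath(base + "/" + path.lstrip("/"))
    return dfs_uri.rstrip("/") + resolved


uri, root = "hdfs://nn1:8020", "/accumulo"  # stand-ins for instance.dfs.uri / instance.dfs.dir
print(replace_relative("/t-0001/F0001.rf", "2", uri, root))
print(replace_relative("../3/t-0002/F0002.rf", "2", uri, root))
```

The point of the flag is that the result depends entirely on the configured URI and directory, so a user would want to verify those before any rewrite happens.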
Still uncertain about the best solution. Looking for the course of least user confusion and least maintenance. I think instance.volume.replaceRelative is a bit confusing from a user perspective.
What other options are there to solve this problem? Any issue w/ the premise?
I'd personally like to have instance.dfs.uri and instance.dfs.dir gone as soon as possible (1.7.0 and later), and I wouldn't want to keep around code that continues to work with relative paths at all, so given the two options, a utility seems the better of the two, because the only code that deals with them would be inside the utility.
Some of us were in favor of automatically re-writing all relative paths during the upgrade to 1.6.0, so that once it was fully up and running, all relative paths would be gone. So, I would not be opposed to automatically doing that in a future 1.6.x upgrade. I'm not a fan of the boolean config, because I think it should be transparent to the user and there's not really a need to expose internal metadata details to users. However, even if we went with this route, we'd still want to support a direct upgrade from 1.6.0 (and any other 1.6.x version that didn't force absolute paths), so a utility would still be needed.
On Tue, Jul 22, 2014 at 4:54 PM, Christopher <[EMAIL PROTECTED]> wrote:
I am not a fan either. The concern that made me think of that possibility was that the user may want to ensure the config for resolving relative paths was correct before proceeding.
I do not want to see anything get re-written between a 1.6.1 system going down and a 1.6.2 system coming up. We have a wire compatibility promise amongst the double-dot releases, and parts moving around really make me nervous. I think it's just too big of a change.
I have no problem with rewriting anything in the internals between 1.6.x and 1.7.0 (or 2.0.0). Based on experience, it will be a lot harder to implement as a stand-alone utility, but I do not have strong preferences on stand-alone or part of the upgrade process.
On Tue, Jul 22, 2014 at 8:37 PM, Josh Elser <[EMAIL PROTECTED]> wrote:
On Wed, Jul 23, 2014 at 3:33 AM, Mike Drob <[EMAIL PROTECTED]> wrote:
Are you concerned about the case where someone starts 1.6.2 and then needs to roll back to 1.6.1 because of a bug in 1.6.2? Replacing relative paths with absolute paths would not prevent 1.6.1 from running, because 1.6.1 can understand absolute paths.
If we do defer this to 1.7.0, then we can mitigate the risk of relative paths causing unexpected bugs by upgrading from 1.5 to 1.7 and exercising many features.
On Tue, Jul 22, 2014 at 9:37 PM, Josh Elser <[EMAIL PROTECTED]> wrote:
I think to make this utility safe, Accumulo would need the ability to take the metadata table offline in order to update the root table. The root table would need to be taken offline in order to safely update ZooKeeper. I don't think the tables can be taken offline. We would also need to ensure nothing brings them online while the operation is running.
On Wed, Jul 23, 2014 at 3:33 AM, Mike Drob <[EMAIL PROTECTED]> wrote:
One thing that's really screwy about the proposal to do this in 1.6.2 (and completely drop support for relative paths in 1.7.0) is that you have to run 1.6.2 before you can upgrade to 1.7.0. This is something Christopher pointed out in an offline discussion yesterday. Is this the concern you had? This may be the biggest reason not to do it. I think in practice most production users will end up on later bug fix versions of 1.6.0 anyway. No one runs 1.4.0 or 1.4.1 anymore. But I'm not sure if we can count on that. If 1.6.1 is stable and works for a user, they may just stick with it.
On Wed, Jul 23, 2014 at 10:24 AM, John Vines <[EMAIL PROTECTED]> wrote:
Christopher and I discussed this yesterday. After 1.6.2 does its thing, it could put a special marker in ZooKeeper. The 1.7.0 upgrade would not start until it sees this.
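A rough Python sketch of that marker idea, just to show the shape of the check. A plain dict stands in for ZooKeeper here, and the marker path, function names, and exception are all hypothetical, not anything that exists in Accumulo.

```python
# Hypothetical marker node; the real location would be decided by the implementation.
MARKER = "/accumulo/relative_paths_removed"


class UpgradeBlocked(Exception):
    """Raised when the 1.7.0 upgrade is attempted before the marker exists."""


def mark_relative_paths_removed(zk):
    # What 1.6.2 would do once every relative path has been rewritten.
    zk[MARKER] = b"done"


def check_can_upgrade(zk):
    # What the 1.7.0 upgrade would do before touching anything.
    if MARKER not in zk:
        raise UpgradeBlocked("run 1.6.2 first so relative paths are rewritten")


zk = {}  # in-memory stand-in for ZooKeeper
try:
    check_can_upgrade(zk)
except UpgradeBlocked as e:
    print("blocked:", e)

mark_relative_paths_removed(zk)
check_can_upgrade(zk)  # no exception once the marker is present
```

The check only guards the upgrade; it does nothing for a user who stays on 1.6.1 and never runs the version that writes the marker.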
As long as you can roll back, then I think it's fine. I also have an implicit promise on wire compatibility, but I don't think that is under huge risk here.
On Wed, Jul 23, 2014 at 8:51 AM, Keith Turner <[EMAIL PROTECTED]> wrote:
I think we've heard of people running 1.4.2 fairly recently (David's teams?) so I would not count on folks upgrading to the latest bugfix. I would really like to see a 1.7 that can upgrade from 1.6.
On Wed, Jul 23, 2014 at 9:28 AM, Keith Turner <[EMAIL PROTECTED]> wrote:
I think you're right, that there's no risk to wire compatibility in this issue, and the data version wouldn't change in any way, so rollback should be fine, too. The discussion here is simply a discussion of when/how to accelerate a metadata upgrade that was already in place in 1.6.0, but did so gradually in 1.6.0. The question at the heart of the matter here is: when can we safely write code that no longer has to have special considerations for relative paths written from older versions?
Christopher L Tubbs II http://gravatar.com/ctubbsii
On Wed, Jul 23, 2014 at 9:17 PM, Mike Drob <[EMAIL PROTECTED]> wrote: