-
Notifications
You must be signed in to change notification settings - Fork 352
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
(maint) Do not use visitors to synchronize modules #1161
Conversation
|
||
@full_path = File.join(@basedir, @dirname) | ||
@path = Pathname.new(File.join(@basedir, @dirname)) | ||
|
||
@puppetfile = R10K::Puppetfile.new(@full_path, | ||
{overrides: @options[:overrides], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a bug that was uncaught earlier, but exposed with the additional testing that comes from attempting to actually use this data at the Module level. It accidentally got rolled into a refactor commit but I'm happy to break it out if that would be better.
@@ -28,7 +28,7 @@ def initialize(opts, argv, settings) | |||
generate_types: @generate_types | |||
}, | |||
modules: { | |||
requested_modules: @argv, | |||
requested_modules: @argv.map.to_a, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a bug in that was exposed with further functional testing. @argv
here acts like an array but is actually a CRI::ArgumentList that includes the Enumerable module. This causes problems with logging that calls inspect
on the value. This change was accidentally rolled into another refactor, but I can break it out if that's best.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be good to add a docs comment to this method indicating the type as it comes in?
lib/r10k/module/base.rb
Outdated
def sync(opts={}) | ||
raise NotImplementedError | ||
end | ||
|
||
def will_sync? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should_sync?
Update instance variable to match method name (sans '?')?
mmm... I was testing with Ruby 2.7 locally. It looks like I may have used something that is specific to that version (everything lower than that version of Ruby failed in Travis). |
Prior to this a Puppetfile could declare a module with: ``` mod 'name' mod 'name', :latest mod 'name', 'version' mod 'name', key_one: 'foo', key_two: ... ``` The optional second parameter(s) were then passed down through the Puppetfile#add_module, Module.implements, and Module.new signatures as the `args` parameter. Being able to assume that the `args` parameter is a Hash simplifies the code in several places - the exception being the Forge Module implementation (which is the only module where anything but a Hash as the args was ever valid) and the DSL methods that convert user input into
The Git implementation was the only user of options to sync, and the value passed is already passed into the Module's initialization.
This consolidates the logging notifying users that a module is being synchronized. Each caller provided slightly different messages so this creates a new, generic message that should work for previous uses.
Prior to this when running `puppetfile install` users would receive a warning whenever processing a git module with a control_branch setting of :control_repo. This is because the puppetfile subcommands have no concept of a containing environment that they operate within. However, this warning should also be applicable when deploying git modules with non-git backed environments (eg SVN or "bare" environments). This warning is also only applicable to git backed modules. This patch moves the warning from the `puppetfile install` command to the git module initialization and changes the message to be more general.
Previously, the deploy module action would determine in the visit_module method if the visited module should be synchronized. This patch moves that logic into the Module::Base class and makes it available to subclasses via the `will_sync?` method. It also updates each module implementation's sync method to only act if `will_sync?` returns true.
For every module deployed via `deploy module`, if the `generate-types` flag was given, we would attempt to generate types for the environments containing that module. This could lead to type generation happening multiple times for each environment, though in practice users typically use this action to update a single module at a time. This moves the logic regarding type generation from the visit_module method on the deploy modules action to the visit_environment method where it makes more sense to do environment wide updates.
Previously, the module.sync method would be called within a visit_module method on each action controlling the sync. This removes the visit_module methods from actions and has the locations where `module.accept(visitor)` as called before call `module.sync` instead. The visitor interface would wrap every visit_* method in a rescue block and set the `visit_ok` instance variable to false if there was an issue. To keep this general behavior for each action we've add this behavior to the `call` method of the action.
259bf35
to
f575902
Compare
Looks like Array#intersection wasn't added until 2.7. I've replaced it with the (IMO less readable) Array#&. |
I should do a bit more functional testing as well, but I wanted folks to be able to start reviewing asap. |
lib/r10k/environment/with_modules.rb
Outdated
if !args.is_a?(Hash) | ||
args = { version: args } | ||
end | ||
|
||
add_module(name, args) | ||
end | ||
end | ||
|
||
# @param [String] name | ||
# @param [*Object] args |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should update the docs string if we require this to be a hash.
lib/r10k/module/base.rb
Outdated
def sync(opts={}) | ||
raise NotImplementedError | ||
logger.info _("Checking module %{path}") % {path: path} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if I like the word "checking" or not... So this gets logged any time we go to sync a module, right? And that is when we're deploying either a whole env or just this module, and the module may
- not exist
- exist but need updating
- exist and already be at the right version
Is there additional logging elsewhere to indicate which of these states we were in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not for those higher level cases and not at info level. However, there are debug messages by lower level objects that can differentiate between those. I was going to use "Synchronizing" since internally it's called "sync" but I something stopped me from doing that - buy I don't remember why anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I kind of agree that synchronizing
might not be good, because sometimes we don't do anything (e.g. if the module is already in the right state). In that sense, "checking" is fine. I guess my mind just says "Checking for what?"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd also be fine moving back to the term "Deploying" since that was used previously in two the three previous messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For least-change, I'd also suggest going back to "Deploying". Note on the purpose of %{origin}
, that helps indicate in the logging where a module was defined, which can be useful when, for example, a module is defined both in the Puppetfile and by the environment, and the environment's version is overriding the definition from the Puppetfile.
@@ -28,7 +28,7 @@ def initialize(opts, argv, settings) | |||
generate_types: @generate_types | |||
}, | |||
modules: { | |||
requested_modules: @argv, | |||
requested_modules: @argv.map.to_a, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be good to add a docs comment to this method indicating the type as it comes in?
Personally I'm pro consolidating APIs so they don't take multiple types, particularly a language with heavy type inference like Ruby. The variability adds complexity that usually feels unnecessary. However, I understand it could be a breaking change. Are people using r10k as a library? Does Bolt, maybe? |
Would also love to get @reidmv 's feedback on this work. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have mixed feelings about this.
I don't like calling sync() on all modules, along with settings, and having them self-decide what to do with that.
Would we be able to simplify this if we instead sort of "demoted" the Puppetfile?
Alternate algorith:
- [Action]
Visit deployment- [Deployment]
Choose, then visit, environments- [Environment]
Callenvironment.puppetfile.load()
Choose, then concurrently visit, modules- [Module]
Callmodule.sync()
- [Module]
- [Environment]
- [Deployment]
Would that work? Would demoting the Puppetfile and removing it from the visitor pattern accomplish the reduction in complexity that's needed?
Side note: laying it out like this, it seems like we could also decide to visit chosen environments concurrently, if we wanted to. We'd probably just need to move our queuing/concurrency code into an R10K::Action::Visitor#concurrently_visit
method, which accepted an array of other
objects instead of just one.
lib/r10k/module/base.rb
Outdated
def sync(opts={}) | ||
raise NotImplementedError | ||
logger.info _("Checking module %{path}") % {path: path} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For least-change, I'd also suggest going back to "Deploying". Note on the purpose of %{origin}
, that helps indicate in the logging where a module was defined, which can be useful when, for example, a module is defined both in the Puppetfile and by the environment, and the environment's version is overriding the definition from the Puppetfile.
lib/r10k/action/deploy/module.rb
Outdated
end | ||
else | ||
logger.debug1(_("Only updating modules %{modules}, skipping module %{mod_name}") % {modules: requested_mods.inspect, mod_name: mod.name}) | ||
mod.sync |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going through this commit-by-commit, so maybe I'll come around to this as I move forward, but... at first blush, I feel like this change (1901284) actually makes the code more confusing, rather than less. Admittedly, I've been working with r10k for a little while, so may be looking a little more favorably on visitor stuff than I would have when I first started trying to figure it out.
In general, I don't like the idea of putting decision-making code into module objects. I am in favor of keeping module objects dumb, being given only simple instructions and doing what they're told. I think we should be able to find a way to simplify the visitor stuff without losing that simplicity for module objects.
I don't quite follow what you mean by "demoting" the Puppetfile. But, if you'd like my idea of where this is going it's this:
Is something like that what you're thinking of when you say "demote" the Puppetfile?
As far as the calling sync and letting them decide... I don't really care. Having the module container make the call of what module gets sync called on it was my approach in #1145. In that case the container (in that case a Puppetfile) will need to separately track all the modules loaded, the content to be managed (and not purged), and the modules to actually sync. Having the module itself know if it should be updated is also fine by me and I believe closely tracks your suggestion in this comment: #1145 (comment). I don't think there's a huge difference between calling mod.sync when mod#sync is defined by: def sync
if should_sync?
do stuff...
end vs mod.sync if mod.should_sync? and def sync
do stuff...
end Assuming the I think it's a better separation of concerns for the caller of module.sync to decide but I also think it makes the caller more complicated for logic that is pretty trivially inlined into the module. So 🤷. I have a feeling you were less concerned with which object had the responsibility and more to do with the fact that the particular object being updated was the Puppetfile class? Regardless of the end state, I do think that having the module determine whether or not it syncs will be advantageous to the further refactors that I mentioned at the top of this comment (breaking up the Puppetfile class). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Talked with @justinstoller about this, I'm 👍 to move it forward re: design.
There are a lot of good fixes in here. If we can move it forward, as well as get a solution for CODEMGMT-1415 in place, there will be room afterwards to start more gentle refactoring, if we want to and if it's important enough.
Small steps = good. 🙂
c8a7478
to
dbc1242
Compare
In removing the visit_module functionality logic that was duplicated within actions has been consolidated. Notably the info message in each action before it called
module.sync
was one of:And now is:
(Open to to a better message, btw).
By moving the
control_branch
logic from thepuppetfile install
command into Module::Git users will see the warning about invalid control branches if applicable during regular deploys as well (eg, if they are deploying git modules with the control_branch pattern in an SVN or Yaml backed environment).Giving the Module::Base have a method like
will_sync?
seems a bit unnecessary for the single use case of thedeploy module
action, however I think it will provide a useful start to the kind of behavior that Reid mentions in #1145 (comment).The first commit is a separate refactor that changes the API of Module.new to work more like the (Source|Environment).from_hash that is very similar to. I'm torn on whether that level of breaking change should go in. Would love to hear comments about how much it might be used in the wild (vs being a private API to R10K). I'm also happy to remove that commit as it isn't directly related to the remaining changes.
Please add all notable changes to the "Unreleased" section of the CHANGELOG in the format: