
[#1058] Avoid concurrent cache updates when downloading modules on multiple threads #1081

Merged: 5 commits merged into puppetlabs:master from concurrent-cache on Jul 30, 2020

Conversation

@Magisus (Collaborator) commented Jul 27, 2020

Supersedes #1080

justinstoller and others added 2 commits July 27, 2020 13:34
When we raised the default threadpool size for deploying modules above 1, we began to see issues when caching a remote locally if that remote/cache was shared by multiple modules within the Puppetfile.

To resolve this, when we create a module we group it with any other modules that may share its cachedir. We then pull modules off the queue by cachedir and process each group serially.
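As a rough sketch of that idea (illustrative Ruby, not the exact r10k implementation; `cachedir` and `sync` stand in for the real module API):

```ruby
require 'thread'

# Group modules that share a local VCS cache so each group is handled
# by a single worker, avoiding concurrent writes to the same cache.
def modules_queue(modules)
  Queue.new.tap do |queue|
    modules.group_by(&:cachedir).each_value { |group| queue << group }
  end
end

# Each worker pops a whole group and deploys its members serially.
def worker(queue)
  loop do
    begin
      group = queue.pop(true) # non-blocking pop; raises ThreadError when empty
    rescue ThreadError
      break
    end
    group.each(&:sync)        # modules sharing a cachedir sync one at a time
  end
end
```

Modules with distinct cachedirs still deploy in parallel across workers; only modules sharing a remote are forced onto a single thread.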
@Magisus (Collaborator, Author) commented Jul 27, 2020

Still interested in thoughts on #1059 (comment).

@mwaggett previously approved these changes Jul 27, 2020
@Magisus force-pushed the concurrent-cache branch 2 times, most recently from aada897 to cc987ac on July 27, 2020 21:02
@Magisus force-pushed the concurrent-cache branch from cc987ac to 0272397 on July 27, 2020 21:18
@mwaggett previously approved these changes Jul 27, 2020
@Magisus (Collaborator, Author) commented Jul 28, 2020

This does increase the default pool_size again; I think that's what we wanted?

@Magisus (Collaborator, Author) commented Jul 28, 2020

There was a request way back when for an acceptance test that demonstrates this bug (and its fix), so I'm working on that now.

@Magisus (Collaborator, Author) commented Jul 28, 2020

I verified that this test failed with a pool size greater than one, before this fix. I had forgotten, but I'll note here (as well as in the commit message) for posterity that the error only occurs in shellgit.

Magisus added 2 commits July 28, 2020 13:28
This commit changes the default number of threads used when downloading
modules from 1 to 4.
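For context, pool_size is a top-level setting in r10k.yaml; a minimal sketch of pinning it explicitly (path and value illustrative):

```yaml
# /etc/puppetlabs/r10k/r10k.yaml
# Number of threads used to download modules during a deploy.
pool_size: 4
```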
@Magisus force-pushed the concurrent-cache branch from f4235f8 to c5ccccc on July 28, 2020 20:28
CONF

puppetfile = <<-EOS
mod 'non_module_object_1',
Contributor:

what does it mean for this to be a 'non module object'?

Magisus (Collaborator, Author):

They probably don't have to be for this test to work; I might change that.

But some useful links on that topic: https://tickets.puppetlabs.com/browse/CODEMGMT-649, https://puppet.com/docs/pe/2018.1/puppetfile.html#declare_git_repositories_in_the_puppetfile

@Magisus (Collaborator, Author) commented Jul 28, 2020

I'll update the test to be a slightly less-weird setup. There's no reason most of this can't be a more usual workflow and still demonstrate this bug.

@Magisus force-pushed the concurrent-cache branch from c5ccccc to eb9e026 on July 28, 2020 22:44
@Magisus (Collaborator, Author) commented Jul 28, 2020

@mwaggett see if that's a little clearer. I left the source repo alone for consistency, like we discussed, but updated what's in the Puppetfile to be less unusual and added a couple of assertions to better illustrate the result of the deploy.

git_repo_path = '/git_repos'
git_repo_name = 'environments'
git_control_remote = File.join(git_repo_path, "#{git_repo_name}.git")
code_dir = '/etc/puppetlabs/code/environments/production'
Contributor:

Should the env_path be incorporated into this somehow? Is it "#{env_path}/production" or "/etc/puppetlabs/code/#{env_path}/production"? Looks like the former, based on https://puppet.com/docs/puppet/5.5/environments_creating.html#environmentpath .

Magisus (Collaborator, Author):

The env_path is the absolute path, so /etc/puppetlabs/code/environments. But yeah, it can be incorporated.

Contributor:

cool, yeah, I think if we include env_path, it's clearer that the git_repo_name has no relationship to that path.

Magisus (Collaborator, Author):

updated
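A minimal sketch of the suggested change, assuming env_path holds the absolute environments directory:

```ruby
env_path = '/etc/puppetlabs/code/environments'
code_dir = File.join(env_path, 'production')
# => "/etc/puppetlabs/code/environments/production"
```

Deriving code_dir from env_path makes it obvious that git_repo_name plays no part in the deployed code path.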


mod 'test_apache',
:git => 'git://github.com/puppetlabs/puppetlabs-apache.git',
:branch => 'master'
Contributor:

Would it be more realistic for this to be the 'test' branch?

Is it worth testing both the more realistic case above and a case where the sources are literally identical? Or is that kind of redundant/unnecessary?

Magisus (Collaborator, Author):

That module only has a master and a release branch in its repo, and I'm not sure if release is permanent. I need to make sure it's something that this test is unlikely to fail on.

Magisus (Collaborator, Author):

Users in the r10k community were unable to come up with a real use case that would trigger this bug. So no matter what we do here, it's not likely to match up to any known real workflow.

Contributor:

gotcha, okay 👍

Magisus (Collaborator, Author):

I could find a different module that has more branches, or create a fixture, but we use apache elsewhere in these tests so I went with it.

Contributor:

if we don't think the two cases are different, I'm fine leaving it

This commit adds a beaker acceptance test verifying that it is possible to use the same remote for several non-module objects. This case previously broke when the `pool_size` feature was introduced: creating the git cache in shellgit mode raced when the same remote was used by two objects being set up concurrently.
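For illustration, a minimal Puppetfile shape that exercises the race; the names and URL here are hypothetical, not the test's actual fixtures:

```ruby
# Both entries point at the same remote, so they map to the same local
# git cache; before this fix, deploying them on separate threads could
# race while that cache was being created (shellgit mode only).
mod 'shared_data_a',
  :git    => 'git://github.com/example/shared-repo.git',
  :branch => 'master'

mod 'shared_data_b',
  :git    => 'git://github.com/example/shared-repo.git',
  :branch => 'production'
```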
@Magisus force-pushed the concurrent-cache branch from eb9e026 to fbc8461 on July 28, 2020 23:04
@Magisus (Collaborator, Author) commented Jul 29, 2020

@adrienthebo I'd appreciate a look at this if you have a minute, since the original concern with WithModules was yours. Justin and I don't see an issue, but maybe we're missing something? #1059 (comment)

@dhollinger left a comment:

Just had one question; otherwise looks good to me.

@@ -212,15 +221,22 @@ def concurrent_accept(visitor, pool_size)
   def modules_queue(visitor)
     Queue.new.tap do |queue|
       visitor.visit(:puppetfile, self) do
-        modules.each { |mod| queue << mod }
+        modules_by_cachedir = modules_by_vcs_cachedir.clone

@dhollinger:

Am I missing something here? Should modules_by_vcs_cachedir be an instance variable, or is this appropriate since it's available as an attr_reader?

Magisus (Collaborator, Author):

Correct, this calls the attr_reader method, whose default implementation simply returns the instance variable of the same name. We added the reader so we could retrieve that value in the tests. This could just as easily be @modules_by_vcs_cachedir.clone, since we're in the class where the variable is defined, but as I understand it, it's generally good practice to use the reader method if one is provided. That makes it easier to override with a custom getter later, if you end up needing to do something special when retrieving the value.
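A standalone Ruby illustration of that convention (hypothetical class, not the actual r10k source):

```ruby
class ModuleContainer
  # attr_reader defines a getter that just returns @modules_by_vcs_cachedir.
  attr_reader :modules_by_vcs_cachedir

  def initialize
    @modules_by_vcs_cachedir = {}
  end

  def queue_snapshot
    # Going through the reader rather than the ivar means call sites
    # pick up a custom getter automatically if one is added later.
    modules_by_vcs_cachedir.clone
  end
end

class AuditedContainer < ModuleContainer
  # Overriding the getter changes queue_snapshot's behavior with no
  # edits to queue_snapshot itself.
  def modules_by_vcs_cachedir
    warn 'cachedir map read'
    super
  end
end
```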

@mwaggett merged commit 48ac619 into puppetlabs:master on Jul 30, 2020