Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refs #30057 - Configure Pulpcore Worker Count #100

Merged
merged 1 commit into from
Jul 1, 2020

Conversation

wbclark
Copy link
Collaborator

@wbclark wbclark commented Jun 11, 2020

No description provided.

@wbclark
Copy link
Collaborator Author

wbclark commented Jun 15, 2020

@ehelms I checked with several of the largest enterprise Katello users and found that some do indeed use more than 8 workers for Pulp 2. The largest count I'm aware of in production is one organization that uses 16 workers in their deployment. That is an environment with extensive hardware resources and I believe around 65,000 Content Hosts.

For such use cases, I felt it was necessary to not set a hard limit of 8 workers but instead set an upper limit which is beyond what anyone could reasonably use.

So my intended behavior at the time that I wrote this is that default parameters will enable one worker per logical CPU up to the soft limit of 8, but users who understand the performance trade-offs should still have the option to increase that value up to a hard limit.

The hard limit wouldn't be strictly necessary at all, except that it provides a convenient way to disable and stop excess workers when the count is decreased by the user. With that said, perhaps there is a better pattern in Puppet for that but I'm not aware of it. My current implementation is admittedly pretty kludgy but - pulp{2,3} differences notwithstanding for now - should satisfy my 3 requirements of 1. sane default behavior for the majority of users, 2. ability to tune higher if really desired, and 3. prevent excess workers from remaining enabled in cases where the count is lowered.

@ehelms
Copy link
Member

ehelms commented Jun 16, 2020

Sounds good, once tests are passing I'm good with this change.

@wbclark wbclark force-pushed the 30057__pulp3_workers_count branch 2 times, most recently from 4f52d4e to 5edcea8 Compare June 19, 2020 21:31
Copy link
Member

@ekohl ekohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theforeman/puppet-foreman#843 I chose a custom fact. In this case you don't have to worry about the migration so it's easier. You also don't have to deal with packaging placing a file that you need to clean up.

I don't mind merging this now (provided the lint comment is addressed), but it'd be nice to lift the artificial limit of 25.

@@ -115,6 +120,7 @@
Stdlib::Fqdn $servername = $facts['networking']['fqdn'],
Array[Stdlib::Absolutepath] $allowed_import_path = ['/var/lib/pulp/sync_imports'],
Optional[String] $remote_user_environ_name = undef,
Integer[1, 24] $worker_count = min(8, $facts['processors']['count']), # lint:ignore:parameter_documentation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you ignore this? It is documented now.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was getting a lint failure that $facts['processors']['count'] is not documented. (Interesting to me that we didn't see the same issue with $facts['networking']['fqdn'] three lines above)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wbclark wbclark force-pushed the 30057__pulp3_workers_count branch from 5edcea8 to e7b7087 Compare June 23, 2020 12:53
@wbclark
Copy link
Collaborator Author

wbclark commented Jun 23, 2020

Thanks @ekohl for the suggestions. I've pushed some changes and I'll explain these below:

  1. I removed the #lint_ignore but I'm expecting it to complain due to a bug with the linter (we'll probably add that back for now while investigating a fix for the bug)

  2. I'm taking the suggested approach of using a custom fact, which allows me to lift the arbitrary limit of 24 workers. While I think it's unlikely anyone would ever need more than 24 workers, I think the code is a bit cleaner this way.

  3. TODO: I want to implement a test case of decreasing the worker count to have confidence about the above implementation

@wbclark wbclark force-pushed the 30057__pulp3_workers_count branch 3 times, most recently from 15f4b4d to 51e167f Compare June 25, 2020 00:38
@wbclark wbclark force-pushed the 30057__pulp3_workers_count branch 3 times, most recently from ad08f74 to 9f48370 Compare June 25, 2020 20:36
is_expected.to contain_service('pulpcore-worker@5')
.with_ensure('running')
.with_enable('true')
is_expected.to contain_service('pulpcore-worker@6')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this to be true, you need to set the fact:

let(:facts) { super().merge(pulpcore_workers: ['pulpcore-worker@6.service']) }

I'd also make this a separate it block since it's something different you want to ensure from the previous one. The test with worker 5 ensures worker_count sets the right number of running workers. Worker 6 tests that the purging happens.

If you want to test more edge cases, you'd test with:

  • worker count 5
    • fact missing
    • fact present, but empty
    • fact with 6 workers
    • fact with 6 workers

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ekohl how could the fact be missing? my understanding is that the fact should always be present, and only if the current worker count is zero should it be empty.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The choice I made in puppet-foreman was not to generate a fact if the directory didn't exist. Similar, you could check for the existence of the pulp-worker@.service file. One benefit of not presenting it is that you don't generate the fact for every host in your Puppet environment. That helps with fact storage, though I don't know how much it really matters.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation @ekohl . In that case I think it does matter, c.f. https://access.redhat.com/solutions/3764171

I'll check for pulpcore-worker@service in the fact definition

@ekohl ekohl linked an issue Jun 26, 2020 that may be closed by this pull request
@wbclark wbclark force-pushed the 30057__pulp3_workers_count branch from 9f48370 to 9c67dd0 Compare June 28, 2020 23:48
Copy link
Member

@ekohl ekohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than repeating it every time I'd suggest to nest:

context 'with worker_count set to 5' do
  let(:params) { { worker_count: 5} }

  context 'with 6 workers from fact' do
    let(:facts) { super().merge(pulpcore_workers: (1..6).map { |n| "pulpcore-worker@#{n}.service" } }

    it 'is expected to reduce the workers to 5' do
      # ...
    end
  end

  context 'with 4 workers from fact' do
    # ...
  end

  context 'with 0 workers from fact' do
    # ...
  end
end

That's how rspec is supposed to be used and you can see the same structure if you run the tests (in verbose mode).


it 'is expected to reduce the workers to 5' do
is_expected.to compile.with_all_deps
is_expected.to contain_service('pulpcore-worker@5')
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ekohl could you recommend syntax to actually loop over workers 1 through 5 here rather than just checking the fifth worker?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Ruby you can do it:

it 'is expected to reduce the workers to 5' do
  (1..5).each do |n|
    is_expected.to contain_service("pulpcore-worker@#{n}")
  end
end

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I wasn't sure about nesting it within it...do

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://www.rubyguides.com/2016/02/ruby-procs-and-lambdas/ might be interesting to read. The signature is roughly it(message = nil, &block). Once you know that, it feels more familiar.

@wbclark wbclark force-pushed the 30057__pulp3_workers_count branch 7 times, most recently from 868aa2a to 92d140f Compare July 1, 2020 18:20
@wbclark wbclark force-pushed the 30057__pulp3_workers_count branch from 92d140f to c40ccab Compare July 1, 2020 18:52
@wbclark wbclark merged commit 1e41eb3 into theforeman:master Jul 1, 2020
@wbclark wbclark added the Enhancement New feature or request label Aug 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement New feature or request Needs testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Figure out how to make pulp-worker count configurable
4 participants