Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Auto-restart systemd workers results in duplicates #20824

Merged

Conversation

agrare
Copy link
Member

@agrare agrare commented Nov 18, 2020

Because the worker GUID is part of the systemd service when a systemd worker is restarted by systemd it will get a new PID but reuse the old GUID.

If MiqServer deletes the worker row because it doesn't see the PID then you end up with a service restarting and failing to find its row constantly.

Fixes #20822

Because the worker GUID is part of the systemd service when a systemd
worker is restarted by systemd it will get a new PID but reuse the old
GUID.  If MiqServer deletes the worker row because it doesn't see the
PID then you end up with a service restarting and failing to find its
row constantly.
@agrare
Copy link
Member Author

agrare commented Nov 18, 2020

I don't like where we write these files out, I'd prefer we write them once on startup and move on. Going to tackle that then take this out of WIP.

@agrare agrare force-pushed the dont_restart_systemd_worker_automatically branch from 08d5894 to 4b54bce Compare November 18, 2020 17:27
@agrare agrare changed the title [WIP] Auto-restart systemd workers results in duplicates Auto-restart systemd workers results in duplicates Nov 18, 2020
@miq-bot miq-bot removed the wip label Nov 18, 2020
When the systemd worker option was first introduced as a prototype there
was a setting that would control if systemd or the normal forking worker
mode was used.  This setting was per worker, and was done in
sync_workers so that you could change the setting "on-the-fly" without
restarting the entire evmserverd process.  This made it much easier to
try out using systemd on one or two workers very quickly.

Now that systemd is the default for appliances the ability to change
this dynamically isn't very helpful, and writing out systemd unit files
during sync_workers is just a lot of overhead.

Moving the writing of systemd unit files to miq_server startup removes
the overhead per-sync_workers-loop and allows us to update the contents
of the files without checking .exist? (which was done due to the
frequency with which it was being run).
@agrare agrare force-pushed the dont_restart_systemd_worker_automatically branch from 532a99d to d215045 Compare November 19, 2020 15:14
@miq-bot
Copy link
Member

miq-bot commented Nov 19, 2020

Checked commits agrare/manageiq@b734fad~...d215045 with ruby 2.6.3, rubocop 0.82.0, haml-lint 0.35.0, and yamllint
4 files checked, 0 offenses detected
Everything looks fine. ⭐

Copy link
Member

@jrafanie jrafanie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@chessbyte
Copy link
Member

LGTM 👍

@jrafanie are you waiting on something/someone before merging?

@jrafanie
Copy link
Member

@jrafanie are you waiting on something/someone before merging?

Just travis...it's green now, thanks for the ping

@jrafanie jrafanie merged commit 2f8ae23 into ManageIQ:master Nov 19, 2020
@agrare agrare deleted the dont_restart_systemd_worker_automatically branch November 19, 2020 17:31
simaishi pushed a commit that referenced this pull request Nov 19, 2020
…omatically

Auto-restart systemd workers results in duplicates

(cherry picked from commit 2f8ae23)
@simaishi
Copy link
Contributor

Kasparov backport details:

$ git log -1
commit 8c5ca97fd5ca2b6792396412b9a3afd9f63b5334
Author: Joe Rafaniello <jrafanie@users.noreply.github.com>
Date:   Thu Nov 19 12:30:37 2020 -0500

    Merge pull request #20824 from agrare/dont_restart_systemd_worker_automatically

    Auto-restart systemd workers results in duplicates

    (cherry picked from commit 2f8ae2345eb25b16ad87f7dc124b4fe68a610bdb)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MiqServer / Run Single Worker race condition when a Systemd Worker dies
5 participants