Allow use of systemd for worker management #18648

Merged
merged 49 commits into from
May 30, 2019

@agrare agrare commented Apr 10, 2019

Allow for workers to be run and managed by systemd. This follows exactly the containerized worker model so it doesn't remove a lot of code, but I see this as a way to get it in and start testing.

A worker consists of the following components:

A target which groups multiple logical units together so they can be managed as a group

cat /etc/systemd/system/generic.target
[Unit]
PartOf=cfme.target

A service template which defines the options for the service. This is the basis for workers that run as multiple unit instances, like most queue workers

cat /etc/systemd/system/generic@.service 
[Unit]
PartOf=generic.target
[Install]
WantedBy=generic.target
[Service]
Environment=HOME=/root
WorkingDirectory=/var/www/miq/vmdb
ExecStart=/bin/bash -lc 'exec ruby lib/workers/bin/run_single_worker.rb MiqGenericWorker --heartbeat --guid=%i'
Restart=always
Slice=cfme-generic.slice
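
As a rough illustration of how the target, template, instance, and slice names relate, here is a minimal Ruby sketch. The `WorkerUnitNames` module and its method names are hypothetical and are not taken from this PR; only the naming patterns (`generic.target`, `generic@.service`, `generic@<guid>.service`, `cfme-generic.slice`) come from the examples in this description.

```ruby
require "securerandom"

# Hypothetical helper, not ManageIQ's implementation: derives the systemd
# names used for one worker type from a base name such as "generic".
module WorkerUnitNames
  module_function

  # Target that groups all instances of one worker type.
  def target(base_name)
    "#{base_name}.target"
  end

  # Template unit that the instances are created from.
  def template(base_name)
    "#{base_name}@.service"
  end

  # Concrete instance; systemd substitutes the part after "@" for %i,
  # which the ExecStart line passes on as --guid.
  def instance(base_name, guid = SecureRandom.uuid)
    "#{base_name}@#{guid}.service"
  end

  # Slice the instances are placed in, as in Slice=cfme-generic.slice.
  def slice(base_name)
    "cfme-#{base_name}.slice"
  end
end

puts WorkerUnitNames.instance("generic", "5a6638a6-c3fa-4287-92c4-b574d176548f")
puts WorkerUnitNames.slice("generic")
```

Using the GUID as the instance name is what lets several workers of the same type share one template while remaining separate units.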

A settings directory where the per-worker settings are stored:

cat /etc/systemd/system/generic@.service.d/settings.conf
[Service]

MemoryHigh=524288000
TimeoutStartSec=600
TimeoutStopSec=600

When running it looks like this:

systemctl status cfme.slice
● cfme.slice
   Loaded: loaded
   Active: active since Wed 2019-04-10 11:43:32 EDT; 4h 48min ago
    Tasks: 16
   Memory: 846.1M
   CGroup: /cfme.slice
           ├─cfme-event_handler.slice
           │ └─event_handler@0e8dbcc4-5ee2-4c66-9704-9c89b5edfcf8.service
           │   └─25044 MIQ: MiqEventHandler id: 88, queue: ems
           ├─cfme-generic.slice
           │ └─generic@5a6638a6-c3fa-4287-92c4-b574d176548f.service
           │   └─25222 MIQ: MiqGenericWorker id: 89, queue: generic
           ├─cfme-priority.slice
           │ └─priority@899c9287-7d7b-4dad-981a-b67001d06c9a.service
           │   └─25052 MIQ: MiqPriorityWorker id: 90, queue: generic
           ├─cfme-reporting.slice
           │ └─reporting@1aa08cef-7579-40fb-ae09-194b3a4252d9.service
           │   └─25056 MIQ: MiqReportingWorker id: 91, queue: reporting
           └─cfme-schedule.slice
             └─schedule@a51d1818-776d-4c21-a9f2-537dedfad128.service
               └─25060 MIQ: MiqScheduleWorker id: 92

Apr 10 11:43:32 agrare-thinkpad systemd[1]: Created slice cfme.slice.

We can get logs by slice:

journalctl --unit=cfme-generic.slice
-- Logs begin at Wed 2019-04-10 11:41:48 EDT, end at Wed 2019-04-10 16:33:02 EDT. --
Apr 10 11:43:32 agrare-thinkpad systemd[1]: Created slice cfme-generic.slice.
Apr 10 11:43:39 agrare-thinkpad ruby[2261]:  INFO -- : MIQ(Vmdb::Loggers.apply_config) Log level for journald has been changed to [INFO]
Apr 10 11:43:39 agrare-thinkpad ruby[2261]:  INFO -- : MIQ(Vmdb::Loggers.apply_config) Log level for azure.log has been changed to [WARN]
Apr 10 11:43:42 agrare-thinkpad bash[2261]: ** Using session_store: ActionDispatch::Session::MemCacheStore
Apr 10 11:43:42 agrare-thinkpad bash[2261]: ** ManageIQ master, codename: Ivanchuk
Apr 10 11:43:43 agrare-thinkpad MIQ: MiqGenericWorker id: 87, queue: generic[2261]:  INFO -- : MIQ(MiqGenericWorker::Runner#sync_config) ID [87], PID [2261], GUID [327d1248-7257-41c9-8d18-ae82b5e913d5], Zone [d
Apr 10 11:43:43 agrare-thinkpad MIQ: MiqGenericWorker id: 87, queue: generic[2261]:  INFO -- :

We can also filter by priority, for example to show only log levels warning and above: journalctl --unit=cfme-generic.slice --priority=warning
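
For scripting these queries, a small wrapper can assemble the arguments. This is a sketch only: `journalctl_command` is a hypothetical helper, and it uses just the two flags shown above (`--unit` and `--priority`).

```ruby
# Hypothetical helper: builds the journalctl argument list for a worker
# slice. The result is an array, suitable for passing to Kernel#system
# without going through a shell.
def journalctl_command(base_name, priority: nil)
  cmd = ["journalctl", "--unit=cfme-#{base_name}.slice"]
  cmd << "--priority=#{priority}" if priority
  cmd
end

puts journalctl_command("generic").join(" ")
puts journalctl_command("generic", priority: "warning").join(" ")
```

Keeping the command as an array rather than a string avoids quoting problems if it is later run with `system(*cmd)`.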

TODO:

  • Define how to handle provider workers. Options: use the unit instance name for the ems_id (which means we can't handle both an ems_id and a worker count), write out unit files with the ems_id built into the unit file's env vars, or set the env vars in the unit settings
  • Remove dependence on temporary worker instance column
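
The limitation mentioned in the first TODO item can be seen in a small sketch. Everything here is hypothetical: the `unit_for_ems` helper, the `refresh` base name, and the ems_id value 7 are made up for illustration and are not part of this PR.

```ruby
# Hypothetical illustration of the first option in the TODO: the unit
# instance name carries the ems_id instead of a worker GUID.
def unit_for_ems(base_name, ems_id)
  "#{base_name}@#{ems_id}.service"
end

# Asking for two workers for the same provider yields the same unit name
# twice, so there is only one distinct unit to start.
requested = Array.new(2) { unit_for_ems("refresh", 7) }
puts requested.uniq.length
```

Because the instance name is the only thing that distinguishes units created from one template, it cannot encode the provider and still leave room for more than one worker per provider.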

Depends on: ManageIQ/manageiq-loggers#5

@Fryguy Fryguy self-assigned this Apr 25, 2019
@agrare agrare force-pushed the use_systemd_for_worker_management branch 10 times, most recently from f5b311f to 225cd1f Compare April 29, 2019 17:54
@agrare agrare force-pushed the use_systemd_for_worker_management branch from 8b62825 to 9ba6c3a Compare May 20, 2019 14:00
miq-bot commented May 20, 2019

Some comments on commits agrare/manageiq@ff0febc~...9ba6c3a

lib/workers/bin/run_single_worker.rb

  • ⚠️ - 69 - Detected puts. Remove all debugging statements.

miq-bot commented May 20, 2019

Checked commits agrare/manageiq@ff0febc~...9ba6c3a with ruby 2.3.3, rubocop 0.69.0, haml-lint 0.20.0, and yamllint 1.10.0
10 files checked, 9 offenses detected

app/models/miq_worker/systemd_common.rb

lib/miq_environment.rb

lib/vmdb/loggers.rb

lib/workers/bin/run_single_worker.rb

@carbonin

@Fryguy @agrare is this ready to go?

agrare commented May 30, 2019

@carbonin it is ready from my side

@carbonin carbonin assigned carbonin and unassigned Fryguy May 30, 2019
@carbonin carbonin merged commit c95298f into ManageIQ:master May 30, 2019
@carbonin carbonin added this to the Sprint 113 Ending Jun 10, 2019 milestone May 30, 2019
@carbonin

So this is still off by default. Do we know who from QE we should rope in to see if they can run some tests with this enabled?

@agrare agrare deleted the use_systemd_for_worker_management branch May 30, 2019 14:17
agrare commented May 30, 2019

@dmetzger57 can we get QE to set up an appliance with this feature turned on?

@dmetzger57

I'll work with QE to get resources allocated for initial testing. @agrare should I have them contact you directly with any specific questions?

agrare commented May 30, 2019

@agrare should I have them contact you directly with any specific questions?

Definitely not 😆 of course

jrafanie added a commit to jrafanie/manageiq that referenced this pull request Nov 25, 2019
Make systemd optional on systemd enabled systems

Api/web service worker needs ui-classic as it can try to read an existing UI session, which
can contain serialized classes from ui-classic.

Previously, we tried removing fork here: ManageIQ#16130
It was reverted here: ManageIQ#16154

Some of the followups needed to fix the original problems including passing down
ems_id to per ems workers, resolved in:
ManageIQ#16199 and
ManageIQ#18648

At this point, things should just work.