Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Override MiqWorker::Runner.start in MiqWebServerRunnerMixin #15880

Conversation

carbonin
Copy link
Member

@carbonin carbonin commented Aug 22, 2017

We need to do this because MiqWebServerRunnerMixin overrides .start_worker to start the rails server. This needs to happen.

Broken in 7776e35

Without this change, you cannot access the UI using foreman start

edit:

This allows us access to the worker object in run_single_worker.rb and also allows the web service workers to start the rails server which was broken in 7776e35.

The original change altered the call in run_single_worker.rb to call Runner#start rather than Runner.start_worker which missed this code when running a web server worker.

Copy link
Member

@NickLaMuro NickLaMuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't like this.

While I see the issue, and it needs to be fix, the idea with my original PR was that we weren't changing the behavior of the appliance, only workers run with the run_single_worker.rb script. This change makes it so setup_sigterm_trap is called regardless, which I don't think is a good thing.

Let me see if I can figure out a better way to handle it, but good catch with this... ugh...

@NickLaMuro
Copy link
Member

NickLaMuro commented Aug 22, 2017

@carbonin can we possibly do something like this instead (don't know if it is "better" though, but basically overwrite #start instead of .start_worker):

diff --git a/app/models/mixins/miq_web_server_runner_mixin.rb b/app/models/mixins/miq_web_server_runner_mixin.rb
index f901db3..cc1eebb 100644
--- a/app/models/mixins/miq_web_server_runner_mixin.rb
+++ b/app/models/mixins/miq_web_server_runner_mixin.rb
@@ -15,29 +15,26 @@ module MiqWebServerRunnerMixin
     at_exit { do_exit("Exit request received.") }
   end
 
-  module ClassMethods
-    def start_worker(*args)
-      runner = self.new(*args)
-      _log.info("URI: #{runner.worker.uri}")
+  def start
+    _log.info("URI: #{worker.uri}")
 
-      # Do all the SQL worker preparation in the main thread
-      runner.prepare
+    # Do all the SQL worker preparation in the main thread
+    prepare
 
-      # The heartbeating will be done in a separate thread
-      Thread.new { runner.run }
+    # The heartbeating will be done in a separate thread
+    Thread.new { run }
 
-      runner.worker.class.configure_secret_token
-      start_rails_server(runner.worker.rails_server_options)
-    end
+    worker.class.configure_secret_token
+    start_rails_server(worker.rails_server_options)
+  end
 
-    def start_rails_server(options)
-      require "rails/commands/server"
+  def start_rails_server(options)
+    require "rails/commands/server"
 
-      _log.info("With options: #{options.except(:app).inspect}")
-      Rails::Server.new(options).tap do |server|
-        Dir.chdir(Vmdb::Application.root)
-        server.start
-      end
+    _log.info("With options: #{options.except(:app).inspect}")
+    Rails::Server.new(options).tap do |server|
+      Dir.chdir(Vmdb::Application.root)
+      server.start
     end
   end
 end

@carbonin
Copy link
Member Author

Is there a reason we don't want this in the forking worker code?

Alternatively we could just hold this change off until we branch Gaprindashvilli?

@NickLaMuro
Copy link
Member

@carbonin Part of what the current implementation of MiqWorker::Runner is doing is handling SIGTERM in it's own way here:

https://github.com/ManageIQ/manageiq/blob/master/app/models/miq_worker/runner.rb#L131

This won't get called normally unless a worker is killed via a timeout when stopping. Normally a stop signal is received by by processing it from messages received over DRB as part of the worker runner loop (called from heartbeat):

https://github.com/ManageIQ/manageiq/blob/master/app/models/miq_worker/runner.rb#L330
https://github.com/ManageIQ/manageiq/blob/master/app/models/miq_worker/runner.rb#L330
https://github.com/ManageIQ/manageiq/blob/master/app/models/miq_worker/runner.rb#L263

The SIGTERM rescue in #start is a "oh S4!#, I need to exit", but dumps any work that it was currently doing on the ground. I was going around this via the signal trap for workers using run_single_worker.rb because they won't want to stop what they are doing when a SIGTERM is called, but continue working until there current work is finished, then break (trap has a higher precedence then the rescue... basically). This is not functionality that we want in the normal workers though when SIGTERM is called.

Hopefully this makes sense...

@carbonin carbonin force-pushed the fix_run_single_worker_for_web_service_workers branch from 05b3a36 to 2969535 Compare August 22, 2017 21:50
@carbonin
Copy link
Member Author

Ah okay, thanks for the explanation @NickLaMuro

I was under the impression that you were replacing the current signal handling rather than trying to write something to go along side it.

I'm not sure that we send SIGTERM to any of our workers during normal operation (MiqWorker#kill is implemented with SIGKILL), but if you think there's a chance we do and we should preserve the original behavior, I'll go with your suggestion.

@carbonin carbonin changed the title Call .start_worker to start the runner in run_single_worker.rb Override MiqWorker::Runner.start in MiqWebServerRunnerMixin Aug 22, 2017
This allows us access to the worker object in run_single_worker.rb
and also allows the web service workers to start the rails server
which was broken in 7776e35.

The original change altered the call in run_single_worker.rb
to call Runner#start rather than Runner.start_worker which
missed this code when running a web server worker.
@carbonin carbonin force-pushed the fix_run_single_worker_for_web_service_workers branch from 2969535 to a000077 Compare August 22, 2017 21:52
@miq-bot
Copy link
Member

miq-bot commented Aug 22, 2017

Checked commit carbonin@a000077 with ruby 2.2.6, rubocop 0.47.1, and haml-lint 0.20.0
1 file checked, 0 offenses detected
Everything looks fine. ⭐

@NickLaMuro
Copy link
Member

I'm not sure that we send SIGTERM to any of our workers during normal operation (MiqWorker#kill is implemented with SIGKILL)

Yup, you are right @carbonin, my bad. That said, I would think that customers doing a SIGTERM or similar would be expecting it to die, and not finish what it is doing.

Anyway, hope my suggestion works 🤞

Copy link
Member

@NickLaMuro NickLaMuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me, though you might want to get a couple of other people's opinions on it.

@carbonin
Copy link
Member Author

@gtanzillo @Fryguy can you guys take a look at this?

Copy link
Member

@gtanzillo gtanzillo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@gtanzillo gtanzillo added this to the Sprint 68 Ending Sep 4, 2017 milestone Aug 28, 2017
@gtanzillo gtanzillo merged commit 3aaee9a into ManageIQ:master Aug 28, 2017
@carbonin carbonin deleted the fix_run_single_worker_for_web_service_workers branch October 13, 2017 19:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants