Queue ServiceAnsiblePlaybook#execute for the embedded_ansible role #19055

carbonin · 2019-07-24T20:07:53Z

Before this change the actual playbook run could happen on any
appliance with the automate role. We need the playbook run to happen
specifically on the embedded_ansible appliance so that we can ensure
the repo update/clone process will run successfully.

cc @gmcculloug @tinaafitz I'm not sure if this can be done in any better way. I wasn't able to find a higher level place to change how this was getting queued. It seems like the actual call stack for execute ends somewhere in automate engine and drb so I feel like that would be harder to tackle than this.

But if someone knows that there's somewhere else I can make this change, then I'm open for suggestions.

NickLaMuro

Few questions/concerns, but my guess is they just need clarification for my stupid self and this probably is fine as is.

NickLaMuro · 2019-07-24T20:20:55Z

app/models/service_ansible_playbook.rb

+      :role        => "embedded_ansible"
+    }
+
+    task_id = MiqTask.generic_action_with_callback(task_opts, queue_opts)


I don't think this is worth holding up the merge of this PR, but this looks very similar to what is available through EmbeddedAnsible::CrudCommon:

manageiq/app/models/manageiq/providers/embedded_ansible/crud_common.rb

Lines 5 to 21 in 82fe638

def queue(zone, instance_id, method_name, args, action, auth_user)

task_opts = {

:action => action,

:userid => auth_user || "system"

}

queue_opts = {

:args => args,

:priority => MiqQueue::HIGH_PRIORITY,

:class_name => name,

:method_name => method_name,

:role => "embedded_ansible",

:zone => zone

}

queue_opts[:instance_id] = instance_id if instance_id

MiqTask.generic_action_with_callback(task_opts, queue_opts)

end

So it would nice if there was a clean way to share this code without having to dig into the provider code to do it. But again, I think this is fine as a surgical fix and I think trying to DRY this up can be done at a later date.

That said, I guess my question for this new implementation is what is now happening with this line (which exists in _execute now:

manageiq/app/models/service_ansible_playbook.rb

Line 28 in 82fe638

new_job = ManageIQ::Providers::EmbeddedAnsible::AutomationManager::Job.create_job(jt, decrypt_options(opts))

Is that just going to cause the same issue that we are seeing now, just with another queue item in between it?

(Note: You can just expand the code in the diff below to see what I talking about, just shared a permalink since I can't comment on code that hasn't changed)

So I don't think it's really worth DRYing this particular bit up. I think this is kinda how the task/queue stuff is meant to be used. Plus I think there's a fair chance that the args here could change separately from the provider CRUD stuff.

I was thinking that there could be a use case for creating an MiqTask helper to do the wait_for_taskid and raise if the status is no-good, but we're not doing that in crud_common so ... shrug.

Is that just going to cause the same issue that we are seeing now, just with another queue item in between it?

No, at this point in the thread of execution we will queue new items for states in the job, but those are all queued for the same server here:

manageiq/app/models/manageiq/providers/ansible_runner_workflow.rb

Lines 62 to 78 in 82fe638

def queue_signal(*args, deliver_on: nil)

role = options[:role] || "ems_operations"

priority = options[:priority] || MiqQueue::NORMAL_PRIORITY

MiqQueue.put(

:class_name => self.class.name,

:method_name => "signal",

:instance_id => id,

:priority => priority,

:role => role,

:zone => zone,

:task_id => guid,

:args => args,

:deliver_on => deliver_on,

:server_guid => MiqServer.my_server.guid,

)

end

Ah, sorry, I see now. It is a bit farther down then the preview showed:

manageiq/app/models/manageiq/providers/ansible_runner_workflow.rb

Line 76 in 82fe638

:server_guid => MiqServer.my_server.guid,

I was seeing the *args and was trying to find where it was being passed in from 😩

Okay, cool, looks good to me.

NickLaMuro

Fat fingered the re-review request... not really knowing what it did.

Still good. 🐑 🇮🇹

Fryguy · 2019-07-24T21:46:44Z

app/models/service_ansible_playbook.rb

+    }
+
+    task_id = MiqTask.generic_action_with_callback(task_opts, queue_opts)
+    task = MiqTask.wait_for_taskid(task_id)


I don't think this should wait for task here, because that is effectively a blocking call (sleep loop, if I recall). I expected a new state that uses queue based waiting

The previous implementation didn't have a queue call at all so the blocking implementation is the same as it was previously.

Maybe put another way, there's a check_complete method, and we should probably check for the task completion there, else bail for the next check

At this stage the job itself was queued wasnt it? That was my intention at least, and then the "externally running" job was supposed to be checked in check_complete

At this stage the job itself was queued wasnt it?

That's what I was trying to figure out, but got lost in automate. Here's puts caller.join("\n") at the start of the old #execute method:

/home/ncarboni/.gem/ruby/2.5.1/bundler/gems/manageiq-automation_engine-09c8bbdf6762/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:333:in `public_send' /home/ncarboni/.gem/ruby/2.5.1/bundler/gems/manageiq-automation_engine-09c8bbdf6762/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:333:in `block in object_send' /home/ncarboni/.gem/ruby/2.5.1/bundler/gems/manageiq-automation_engine-09c8bbdf6762/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:352:in `ar_method' /home/ncarboni/.gem/ruby/2.5.1/bundler/gems/manageiq-automation_engine-09c8bbdf6762/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:362:in `ar_method' /home/ncarboni/.gem/ruby/2.5.1/bundler/gems/manageiq-automation_engine-09c8bbdf6762/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:331:in `object_send' /home/ncarboni/.gem/ruby/2.5.1/bundler/gems/manageiq-automation_engine-09c8bbdf6762/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:182:in `block (3 levels) in expose' /home/ncarboni/Source/manageiq/app/models/user.rb:283:in `with_user' /home/ncarboni/.gem/ruby/2.5.1/bundler/gems/manageiq-automation_engine-09c8bbdf6762/lib/miq_automation_engine/engine/miq_ae_method_service/miq_ae_service_model_base.rb:182:in `block (2 levels) in expose' /home/ncarboni/.rubies/ruby-2.5.1/lib/ruby/2.5.0/drb/drb.rb:1630:in `perform_without_block' /home/ncarboni/.rubies/ruby-2.5.1/lib/ruby/2.5.0/drb/drb.rb:1590:in `perform' /home/ncarboni/.rubies/ruby-2.5.1/lib/ruby/2.5.0/drb/drb.rb:1674:in `block (2 levels) in main_loop' /home/ncarboni/.rubies/ruby-2.5.1/lib/ruby/2.5.0/drb/drb.rb:1670:in `loop' /home/ncarboni/.rubies/ruby-2.5.1/lib/ruby/2.5.0/drb/drb.rb:1670:in `block in main_loop'

So up to this point the service was queued for execution, but for the automate role. I now need the job itself to run on the appliance with the embedded_ansible role.

ManageIQ::Providers::EmbeddedAnsible::AutomationManager::Job.create_job calls run on the passed job template which in turn creates the workflow and calls signal the first time, but the base signal implementation just calls the method in the same process and AnsibleRunnerWorkflow#queue_signal is set up to run the method on the same server here.

So to make a long story short, I'm pretty sure this is what we need to do 😅

I think my expectation of what create_job did was wrong, but I still contend that this should be async. In good with merge for now though and talking about that for follow-up since it's technically the same before/after as you've said

lfu · 2019-07-25T13:51:19Z

app/models/service_ansible_playbook.rb

@@ -14,6 +14,25 @@ def preprocess(action, add_options = {})
  end

  def execute(action)
+    task_opts = {
+      :action => "Executing Ansible Playbook",
+      :userid => "system"


Should this be the current user who ordered the service or the service owner?

Maybe? I wasn't sure if this was relevant as the user didn't seem to be used, but I can change it. How to I get the userid for the user who ordered the service (I would assume that's the better option of the two)?

I think this should actually stay as system, because it's just an internal placement mechanism, not something we need user context for. I think the "Executing Ansible Playbook" action name is what makes that confusing and we should change that to "Launching Ansible Job" or something.

Before this change the actual playbook run could happen on any appliance with the automate role. We need the playbook run to happen specifically on the embedded_ansible appliance so that we can ensure the repo update/clone process will run successfully.

miq-bot · 2019-07-26T14:55:43Z

Checked commits carbonin/manageiq@11d517d~...640a8a5 with ruby 2.4.6, rubocop 0.69.0, haml-lint 0.20.0, and yamllint 1.10.0
2 files checked, 0 offenses detected
Everything looks fine. 🍪

carbonin · 2019-07-26T14:56:40Z

Okay, @Fryguy I think I made the changes you wanted for this first pass. I'll work on a followup that makes this async by saving off the task id into the service options and using the check_completed method.

Queue ServiceAnsiblePlaybook#execute for the embedded_ansible role (cherry picked from commit d855e23)

simaishi · 2019-07-26T22:22:15Z

Ivanchuk backport details:

$ git log -1
commit 1ffdecca73ac3e6f13f71644e41b23cc4b8ac017
Author: Jason Frey <jfrey@redhat.com>
Date:   Fri Jul 26 17:32:07 2019 -0400

    Merge pull request #19055 from carbonin/queue_service_playbook_runs
    
    Queue ServiceAnsiblePlaybook#execute for the embedded_ansible role
    
    (cherry picked from commit d855e23f1a6f3f5891f7a01d2c285c050b2618b1)

carbonin added bug lifecycle/services changelog/yes core/embedded ansible ivanchuk/yes labels Jul 24, 2019

carbonin requested review from Fryguy, tinaafitz and NickLaMuro July 24, 2019 20:07

carbonin assigned Fryguy Jul 24, 2019

NickLaMuro reviewed Jul 24, 2019

View reviewed changes

NickLaMuro approved these changes Jul 24, 2019

View reviewed changes

NickLaMuro self-requested a review July 24, 2019 20:47

NickLaMuro approved these changes Jul 24, 2019

View reviewed changes

Fryguy requested changes Jul 24, 2019

View reviewed changes

lfu reviewed Jul 25, 2019

View reviewed changes

Fryguy approved these changes Jul 25, 2019

View reviewed changes

carbonin added 2 commits July 26, 2019 10:53

These are not tower jobs

640a8a5

carbonin force-pushed the queue_service_playbook_runs branch from b08b2f9 to 640a8a5 Compare July 26, 2019 14:54

Fryguy merged commit d855e23 into ManageIQ:master Jul 26, 2019

Fryguy added this to the Sprint 117 Ending Aug 5, 2019 milestone Jul 26, 2019

simaishi pushed a commit that referenced this pull request Jul 26, 2019

Merge pull request #19055 from carbonin/queue_service_playbook_runs

1ffdecc

Queue ServiceAnsiblePlaybook#execute for the embedded_ansible role (cherry picked from commit d855e23)

simaishi added ivanchuk/backported and removed ivanchuk/yes labels Jul 26, 2019

carbonin deleted the queue_service_playbook_runs branch August 16, 2019 15:54

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Queue ServiceAnsiblePlaybook#execute for the embedded_ansible role #19055

Queue ServiceAnsiblePlaybook#execute for the embedded_ansible role #19055

carbonin commented Jul 24, 2019

NickLaMuro left a comment

NickLaMuro Jul 24, 2019

NickLaMuro Jul 24, 2019

carbonin Jul 24, 2019

carbonin Jul 24, 2019

NickLaMuro Jul 24, 2019

NickLaMuro left a comment

Fryguy Jul 24, 2019

carbonin Jul 24, 2019

Fryguy Jul 24, 2019

Fryguy Jul 24, 2019

carbonin Jul 24, 2019

Fryguy Jul 24, 2019

lfu Jul 25, 2019

carbonin Jul 25, 2019

Fryguy Jul 25, 2019

miq-bot commented Jul 26, 2019

carbonin commented Jul 26, 2019

simaishi commented Jul 26, 2019

	def queue(zone, instance_id, method_name, args, action, auth_user)
	task_opts = {
	:action => action,
	:userid => auth_user \|\| "system"
	}

	queue_opts = {
	:args => args,
	:priority => MiqQueue::HIGH_PRIORITY,
	:class_name => name,
	:method_name => method_name,
	:role => "embedded_ansible",
	:zone => zone
	}
	queue_opts[:instance_id] = instance_id if instance_id
	MiqTask.generic_action_with_callback(task_opts, queue_opts)
	end

	def queue_signal(*args, deliver_on: nil)
	role = options[:role] \|\| "ems_operations"
	priority = options[:priority] \|\| MiqQueue::NORMAL_PRIORITY

	MiqQueue.put(
	:class_name => self.class.name,
	:method_name => "signal",
	:instance_id => id,
	:priority => priority,
	:role => role,
	:zone => zone,
	:task_id => guid,
	:args => args,
	:deliver_on => deliver_on,
	:server_guid => MiqServer.my_server.guid,
	)
	end

Queue ServiceAnsiblePlaybook#execute for the embedded_ansible role #19055

Queue ServiceAnsiblePlaybook#execute for the embedded_ansible role #19055

Conversation

carbonin commented Jul 24, 2019

NickLaMuro left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

NickLaMuro left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

miq-bot commented Jul 26, 2019

carbonin commented Jul 26, 2019

simaishi commented Jul 26, 2019