Update playbook_log_stdout to handle possible nil job #20738

djberg96 · 2020-10-27T12:33:20Z

Currently the AnsiblePlaybookMixin#playbook_log_stdout method assumes that a job will always be present. However, if we look at service_ansible_playbook.rb, we see this call:

playbook_log_stdout(log_option, job(action))

And the job(action) looks like this:

service_resources.find_by(:name => action, :resource_type => 'OrchestrationStack').try(:resource)

We know from the try call that this could thus be nil. And, in fact, that's what @d-m-u hit while testing a variant of #20645 where there's a network interruption:

ERROR -- : Q-task_id([r88_service_template_provision_task_88]) The following error occurred during instance method <on_error> for AR object <#<ServiceAnsiblePlaybook id: 88, name: "drew", <snip>
[----] E, [2020-10-26T14:09:27.389968 #1580547:2ab9680952a8] ERROR -- : Q-task_id([r88_service_template_provision_task_88]) MiqAeServiceModelBase.ar_method raised: <NoMethodError>: <undefined method `name' for nil:NilClass>
[----] E, [2020-10-26T14:09:27.390025 #1580547:2ab9680952a8] ERROR -- : Q-task_id([r88_service_template_provision_task_88]) /var/www/miq/vmdb/app/models/mixins/ansible_playbook_mixin.rb:40:in `rescue in playbook_log_stdout'
/var/www/miq/vmdb/app/models/mixins/ansible_playbook_mixin.rb:34:in `playbook_log_stdout'

So this PR updates the method so that it skips logging job.name info if there isn't actually a name.
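
To make the trace above concrete, here is a minimal sketch of how a nil job can fail inside the rescue clause itself. It assumes plain-Ruby stand-ins: the NilJobRepro module and its LOG constant are hypothetical replacements for the mixin and the application's $log, and the method body only mirrors the lines quoted in this thread's diff.

```ruby
require 'logger'
require 'stringio'

# Hypothetical stand-in for the mixin; LOG replaces the application's $log.
module NilJobRepro
  LOG_IO = StringIO.new
  LOG = Logger.new(LOG_IO)

  module_function

  def playbook_log_stdout(log_option, job)
    # A word array is used here for the option check (illustrative choice).
    return unless %w[on_error always].include?(log_option)
    return if log_option == 'on_error' && job.raw_status.succeeded?

    LOG.info("Stdout from ansible job #{job.name}: #{job.raw_stdout('txt_download')}")
  rescue StandardError
    # With job == nil, the NoMethodError from raw_status is rescued here,
    # but job.name then raises a second NoMethodError that escapes the method.
    LOG.error("Failed to get stdout from ansible job #{job.name}")
  end
end

begin
  NilJobRepro.playbook_log_stdout('on_error', nil)
rescue NoMethodError => e
  puts "#{e.class}: #{e.message}"
end
```

The exact wording of the message depends on the Ruby version, but the error originates from the `name` call in the rescue clause, which matches the "rescue in playbook_log_stdout" frame in the trace.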

No specs for now; there aren't any for this mixin currently, and I wasn't sure if they were moved somewhere else.

djberg96 · 2020-10-27T12:33:46Z

CC @NickLaMuro, @d-m-u

d-m-u · 2020-10-27T12:45:44Z

app/models/mixins/ansible_playbook_mixin.rb

@@ -35,9 +35,11 @@ def playbook_log_stdout(log_option, job)
 return unless %(on_error always).include?(log_option)
 return if log_option == 'on_error' && job.raw_status.succeeded?

- $log.info("Stdout from ansible job #{job.name}: #{job.raw_stdout('txt_download')}")
+ $log.info("Stdout from ansible job #{job.name}: #{job.raw_stdout('txt_download')}") if job


In the event we don't have a job, do we want to log that we're missing a job?

djberg96 · 2020-10-27

@tinaafitz, @NickLaMuro, @billfitzgerald0120 What say you?

chessbyte · 2020-10-27

Who are the callers of this method and what is a reasonable expectation?

tinaafitz · 2020-10-27

Hi @chessbyte,

This method gets called from the postprocess state.
https://github.com/ManageIQ/manageiq-content/blob/bb73854dd8d6b1e9647390538ef78046b123f608/content/automate/ManageIQ/Service/Generic/StateMachines/GenericLifecycle.class/__methods__/postprocess.rb#L19

The postprocess state is processed after the playbook has been launched. A valid job is a reasonable expectation.

Fryguy · 2020-10-30

See my comment here: #20738 (comment). IMO callers should be expected to pass a valid job. I think the only thing that should change in this method is to add a guard clause:

raise ArgumentError, "must pass a valid job" if job.nil?
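
As a rough illustration of the guard-clause idea, the sketch below places the raise at the top of a simplified method. Everything except the raise line is an assumption for demonstration purposes: GuardClauseSketch, FakeJob, FakeStatus and LOG are hypothetical stand-ins, not the mixin's real collaborators. One detail worth noting: ArgumentError is a StandardError, so a method-level `rescue StandardError` would swallow the guard unless the error is re-raised first.

```ruby
require 'logger'
require 'stringio'

# Hypothetical stand-ins: LOG replaces $log; FakeJob and FakeStatus model
# only the calls the method makes on a job.
module GuardClauseSketch
  LOG_IO = StringIO.new
  LOG = Logger.new(LOG_IO)

  FakeStatus = Struct.new(:ok) do
    def succeeded?
      ok
    end
  end

  FakeJob = Struct.new(:name, :raw_status, :stdout) do
    def raw_stdout(_format)
      stdout
    end
  end

  module_function

  def playbook_log_stdout(log_option, job)
    raise ArgumentError, "must pass a valid job" if job.nil?
    return unless %w[on_error always].include?(log_option)
    return if log_option == 'on_error' && job.raw_status.succeeded?

    LOG.info("Stdout from ansible job #{job.name}: #{job.raw_stdout('txt_download')}")
  rescue ArgumentError
    raise # let the guard's error reach the caller instead of the generic rescue
  rescue StandardError
    LOG.error("Failed to get stdout from ansible job #{job.name}")
  end
end

job = GuardClauseSketch::FakeJob.new('demo', GuardClauseSketch::FakeStatus.new(false), 'PLAY [all]')
GuardClauseSketch.playbook_log_stdout('on_error', job)
puts GuardClauseSketch::LOG_IO.string
```

With this shape, a caller that passes nil gets an explicit ArgumentError rather than a NoMethodError from deep inside the logging code.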

Fryguy · 2020-10-27T14:38:24Z

I'm surprised we don't have a job as we create the job first before we kick off the run. I think we need some more investigation into how we get into this situation before we band-aid it.

NickLaMuro

Just some thoughts.

NickLaMuro · 2020-10-27T14:47:33Z

app/models/mixins/ansible_playbook_mixin.rb

 rescue StandardError => err
- $log.error("Failed to get stdout from ansible job #{job.name}")
+ msg = "Failed to get stdout from ansible job"
+ msg << " #{job.name}" if job


A few things here:

Would it make sense to rescue NoMethodError above here as well? We could avoid a few of the if calls above by doing that

To address @d-m-u 's concern, could we change the code here to this:

msg = "Failed to get stdout from ansible job #{job ? job.name : '<UnknownJob>'}"

That said, I think the concerns raised by @chessbyte (here) and @Fryguy (here) are valid, so the above suggestion might be unnecessary if we determine that the root cause is something that should not normally happen (I am not sure myself, to be honest).

djberg96 · 2020-10-27T15:18:00Z

@Fryguy, @NickLaMuro I'm actually curious why there's a rescue clause at all in there. All it does is return or log. So, unless the logging itself fails (because coincidentally the job is nil), I'm curious how it would ever reach that condition.

NickLaMuro · 2020-10-27T15:39:34Z

@djberg96 not that this really answers your question, but looks like it has always been this way since it was introduced:

#16414

tinaafitz · 2020-10-27T15:46:41Z

Dan and I discussed the issue and I don't think this PR will be necessary.

djberg96 · 2020-10-27T16:55:33Z

FWIW, I added some specs. So, if it turns out we just want a quick fix, this will still be here.

Fryguy · 2020-10-27T17:39:38Z

Dan and I discussed the issue and I don't think this PR will be necessary.

What was the conclusion?

tinaafitz · 2020-10-28T18:26:42Z

Hi @Fryguy, It doesn't make sense that we don't have a valid job here. I'm starting to look at the environment now.

tinaafitz · 2020-10-30T15:37:30Z

After further investigation, this code change is important.

Here's why:

The check_connection Automate method is called in the pre5 state, which runs before the execute state where the job is created.
The method times out after max_retries = 4, which causes the pre5 state to end in an error.
The update_status Automate method is called on_error. It calls the ServiceAnsiblePlaybook on_error method:
https://github.com/ManageIQ/manageiq-content/blob/master/content/automate/ManageIQ/Service/Generic/StateMachines/GenericLifecycle.class/__methods__/update_status.rb#L64
Which calls:
https://github.com/ManageIQ/manageiq/blob/master/app/models/service_ansible_playbook.rb#L89
Which calls postprocess:
https://github.com/ManageIQ/manageiq/blob/master/app/models/service_ansible_playbook.rb#L93
which assumes a valid job exists.

It's important to note that customers will encounter this issue if they have custom Automate methods in any of the pre* states of this state machine that end in error.

Automate State Machine
/Service/Generic/StateMachines/GenericLifecycle/provision state machine
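
The sequence described above can be modelled with a small sketch. This is a deliberately simplified, hypothetical model: SketchPlaybookService and run_state_machine are illustrative names, and the method bodies only record the order of calls rather than reproduce the ManageIQ implementation.

```ruby
# Hypothetical, simplified model of the flow; not the ManageIQ classes.
class SketchPlaybookService
  attr_reader :calls

  def initialize
    @job = nil # the job only exists once the execute state has run
    @calls = []
  end

  def execute
    @calls << :execute
    @job = Object.new
  end

  def on_error
    @calls << :on_error
    postprocess
  end

  def postprocess
    @calls << :postprocess
    log_stdout
  end

  def log_stdout
    @calls << :log_stdout
    @job.nil? ? :missing_job : :job_present
  end
end

# A pre* state (for example a check_connection timeout in pre5) fails
# before execute runs, so on_error reaches log_stdout without a job.
def run_state_machine(service, pre_state_fails:)
  raise 'pre5 ended in error' if pre_state_fails

  service.execute
  service.postprocess
rescue RuntimeError
  service.on_error
end

service = SketchPlaybookService.new
result = run_state_machine(service, pre_state_fails: true)
puts "#{service.calls.inspect} -> #{result}"
```

In this model the failing run never calls execute, so the final step sees no job, which is the situation the log in the PR description shows.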

Fryguy · 2020-10-30T18:12:27Z

To me, then, this feels like the caller to playbook_log_stdout should be the one to change to check that the job is real. If anything changes in playbook_log_stdout, IMO, it should be to raise an ArgumentError if someone passes nil.

That is, since postprocess seems shared between good and bad runs, it also has to be resilient to good and bad runs. Ultimately postprocess calls log_stdout here:

manageiq/app/models/service_ansible_playbook.rb

Lines 172 to 175 in 647e640

 def log_stdout(action)
   log_option = options.fetch_path(:config_info, action.downcase.to_sym, :log_output) || 'on_error'
   playbook_log_stdout(log_option, job(action))
 end

So I think that method should be changed to account for a nil job....something like:

 def log_stdout(action) 
   log_option = options.fetch_path(:config_info, action.downcase.to_sym, :log_output) || 'on_error' 
   job = job(action)
   if job.nil?
     $log.info("No stdout available due to missing job") # or something to that effect.
   else
     playbook_log_stdout(log_option, job) 
   end
 end
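
A runnable version of that suggestion might look like the sketch below. It is only an approximation under stated assumptions: LogStdoutSketch is a hypothetical stand-in for ServiceAnsiblePlaybook, a plain Hash with dig replaces options.fetch_path, an injected logger replaces $log, and playbook_log_stdout merely records what it was given.

```ruby
require 'logger'
require 'stringio'

# Hypothetical stand-in for the service class; see the assumptions above.
class LogStdoutSketch
  attr_reader :forwarded

  def initialize(options:, jobs:, logger:)
    @options = options
    @jobs = jobs
    @logger = logger
    @forwarded = []
  end

  def log_stdout(action)
    log_option = @options.dig(:config_info, action.downcase.to_sym, :log_output) || 'on_error'
    current_job = job(action) # named so the local does not shadow the job method
    if current_job.nil?
      @logger.info("No stdout available due to missing job")
    else
      playbook_log_stdout(log_option, current_job)
    end
  end

  private

  # Stand-in lookup; the real method queries service_resources.
  def job(action)
    @jobs[action]
  end

  # Records the call instead of reading stdout from a real job.
  def playbook_log_stdout(log_option, job)
    @forwarded << [log_option, job]
  end
end

io = StringIO.new
service = LogStdoutSketch.new(options: {}, jobs: {}, logger: Logger.new(io))
service.log_stdout('Provision')
puts io.string
```

Keeping the nil check in the caller means playbook_log_stdout can continue to assume a real job, which is the division of responsibility proposed in this comment.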

djberg96 · 2020-11-02T13:47:48Z

@Fryguy ok, updated

tinaafitz · 2020-11-02T14:30:02Z

Thanks @Fryguy for pointing that out. That makes sense.

tinaafitz · 2020-11-02T14:36:34Z

Thanks @djberg96

Fryguy · 2020-11-02T15:12:58Z

@djberg96 Can you squash the commits? The commit history now shows a change to the method and then an unchange to the method.

app/models/service_ansible_playbook.rb

spec/models/mixins/ansible_playbook_mixin_spec.rb

Verify job exists in conditional. Initial specs. Add another spec. Restore playbook_log_stdout, modify caller instead.

djberg96 · 2020-11-02T15:53:45Z

@Fryguy commits squashed.

miq-bot · 2020-11-02T16:45:18Z

Checked commits https://github.com/djberg96/manageiq/compare/deb1ed05d8c3ef1bc8c40197039219b74eb8a5f6~...5a3d48e12890dba5cfa534d07b760c3bc3bf76f9 with ruby 2.6.3, rubocop 0.82.0, haml-lint 0.35.0, and yamllint
3 files checked, 11 offenses detected

spec/models/mixins/ansible_playbook_mixin_spec.rb

simaishi · 2020-11-09T17:50:18Z

@djberg96 Can this be ivanchuk/yes (which implies jansa/yes and kasparov/yes)?

djberg96 · 2020-11-09T17:52:41Z

@dmetzger57 What say you?

simaishi · 2020-11-09T17:55:23Z

@djberg96 I mean the compatibility. Are the changes in this PR compatible with the ivanchuk branch, so that it's OK to backport?

dmetzger57 · 2020-11-09T17:55:50Z

The change should be ivanchuk/yes if cleanly backportable

djberg96 · 2020-11-09T17:58:23Z

@miq-bot add_label ivanchuk/yes

Update playbook_log_stdout to handle possible nil job (cherry picked from commit 3493002) https://bugzilla.redhat.com/show_bug.cgi?id=1835226

simaishi · 2020-11-09T18:54:50Z

Ivanchuk backport details:

$ git log -1
commit 3ae0f417dccf908881e1922117f6911f06a48f73
Author: Keenan Brock <keenan@thebrocks.net>
Date:   Wed Nov 4 11:36:54 2020 -0500

    Merge pull request #20738 from djberg96/ansible_playbook_log_stdout

    Update playbook_log_stdout to handle possible nil job

    (cherry picked from commit 349300251dfc90b0933daf3a35faf6abfe9653b2)

    https://bugzilla.redhat.com/show_bug.cgi?id=1835226

simaishi · 2020-11-09T18:56:34Z

Jansa backport details:

$ git log -1
commit dffb46b9e15736334522f21c765c78fc12000003
Author: Keenan Brock <keenan@thebrocks.net>
Date:   Wed Nov 4 11:36:54 2020 -0500

    Merge pull request #20738 from djberg96/ansible_playbook_log_stdout

    Update playbook_log_stdout to handle possible nil job

    (cherry picked from commit 349300251dfc90b0933daf3a35faf6abfe9653b2)

simaishi · 2020-11-09T18:58:23Z

Kasparov backport details:

$ git log -1
commit 949a51e4fb5f87cc2c62957f77b8031e2dc6648a
Author: Keenan Brock <keenan@thebrocks.net>
Date:   Wed Nov 4 11:36:54 2020 -0500

    Merge pull request #20738 from djberg96/ansible_playbook_log_stdout

    Update playbook_log_stdout to handle possible nil job

    (cherry picked from commit 349300251dfc90b0933daf3a35faf6abfe9653b2)

djberg96 requested review from agrare, Fryguy and kbrock as code owners October 27, 2020 12:33

d-m-u reviewed Oct 27, 2020

NickLaMuro reviewed Oct 27, 2020

djberg96 requested a review from gtanzillo as a code owner October 27, 2020 16:52

d-m-u approved these changes Nov 2, 2020

Fryguy reviewed Nov 2, 2020

app/models/service_ansible_playbook.rb

Fryguy reviewed Nov 2, 2020

spec/models/mixins/ansible_playbook_mixin_spec.rb

Update playbook_log_stdout to handle possible nil job.

deb1ed0

Verify job exists in conditional. Initial specs. Add another spec. Restore playbook_log_stdout, modify caller instead.

Raise ArgumentError in playbook_log_stdout, add spec, if job is nil.

5a3d48e

Fryguy approved these changes Nov 2, 2020

d-m-u mentioned this pull request Nov 2, 2020

don't call postprocess without job in playbook on_error #20773

Merged

tinaafitz approved these changes Nov 3, 2020

kbrock merged commit 3493002 into ManageIQ:master Nov 4, 2020

miq-bot added the ivanchuk/yes label Nov 9, 2020

simaishi pushed a commit that referenced this pull request Nov 9, 2020

Merge pull request #20738 from djberg96/ansible_playbook_log_stdout

3ae0f41

Update playbook_log_stdout to handle possible nil job (cherry picked from commit 3493002) https://bugzilla.redhat.com/show_bug.cgi?id=1835226

simaishi pushed a commit that referenced this pull request Nov 9, 2020

Merge pull request #20738 from djberg96/ansible_playbook_log_stdout

dffb46b

Update playbook_log_stdout to handle possible nil job (cherry picked from commit 3493002)

simaishi pushed a commit that referenced this pull request Nov 9, 2020

Merge pull request #20738 from djberg96/ansible_playbook_log_stdout

949a51e

Update playbook_log_stdout to handle possible nil job (cherry picked from commit 3493002)

simaishi added ivanchuk/backported jansa/backported kasparov/backported and removed ivanchuk/yes labels Nov 9, 2020
