Update playbook_log_stdout to handle possible nil job #20738
CC @NickLaMuro, @d-m-u
@@ -35,9 +35,11 @@ def playbook_log_stdout(log_option, job)
 return unless %(on_error always).include?(log_option)
 return if log_option == 'on_error' && job.raw_status.succeeded?

-$log.info("Stdout from ansible job #{job.name}: #{job.raw_stdout('txt_download')}")
+$log.info("Stdout from ansible job #{job.name}: #{job.raw_stdout('txt_download')}") if job
In the event we don't have a job, do we want to log that we're missing a job?
@tinaafitz, @NickLaMuro, @billfitzgerald0120 What say you?
Who are the callers of this method and what is a reasonable expectation?
Hi @chessbyte,
This method gets called from the postprocess state.
https://github.com/ManageIQ/manageiq-content/blob/bb73854dd8d6b1e9647390538ef78046b123f608/content/automate/ManageIQ/Service/Generic/StateMachines/GenericLifecycle.class/__methods__/postprocess.rb#L19
The postprocess state is processed after the playbook has been launched. A valid job is a reasonable expectation.
See my comment here: #20738 (comment) . IMO callers should be expected to pass a valid job. I think the only thing that should change in this method is to add a guard clause:
raise ArgumentError, "must pass a valid job" if job.nil?
I'm surprised we don't have a job, as we create the job first before we kick off the run. I think we need some more investigation into how we get into this situation before we band-aid it.
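For illustration, here is a minimal sketch of how the suggested guard clause would behave. It is not the ManageIQ code: FakeJob, playbook_log_stdout_strict, and the explicit logger argument are hypothetical stand-ins so the snippet runs on its own.

```ruby
require 'logger'
require 'stringio'

# Hypothetical stand-in for an Ansible job; only the methods used below.
FakeJob = Struct.new(:name, :stdout) do
  def raw_stdout(_format)
    stdout
  end
end

# Guard-clause variant: a nil job is treated as a bug in the caller.
def playbook_log_stdout_strict(logger, job)
  raise ArgumentError, "must pass a valid job" if job.nil?

  logger.info("Stdout from ansible job #{job.name}: #{job.raw_stdout('txt_download')}")
end

LOG_IO = StringIO.new
LOGGER = Logger.new(LOG_IO)

# A valid job is logged as before.
playbook_log_stdout_strict(LOGGER, FakeJob.new("provision", "ok"))

# A nil job now fails loudly instead of raising NoMethodError on nil.
begin
  playbook_log_stdout_strict(LOGGER, nil)
rescue ArgumentError => err
  GUARD_MESSAGE = err.message
end
```

The trade-off is that every caller must then check for a missing job itself, otherwise the state that calls this method ends in an error.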
Just some thoughts.
 rescue StandardError => err
-$log.error("Failed to get stdout from ansible job #{job.name}")
+msg = "Failed to get stdout from ansible job"
+msg << " #{job.name}" if job
A few things here:
- Would it make sense to rescue NoMethodError above here as well? We could avoid a few of the if calls above by doing that.
- To address @d-m-u's concern, could we change the code here to this:
msg = "Failed to get stdout from ansible job #{job ? job.name : '<UnknownJob>'}"
That said, I think the concerns raised by @chessbyte (here) and @Fryguy (here) are valid, so that might make the above suggestion unnecessary if we determine that the root cause for this should normally not happen (I am not sure myself, to be honest).
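The two options in this comment can be compared side by side. This is only a sketch: FakeJob and both method names are hypothetical, and the placeholder string is the one proposed above.

```ruby
# Hypothetical stand-in for a job that only exposes a name.
FakeJob = Struct.new(:name)

# Option 1: check for nil inline when building the message.
def failure_message(job)
  "Failed to get stdout from ansible job #{job ? job.name : '<UnknownJob>'}"
end

# Option 2: let nil.name raise NoMethodError and rescue it.
def failure_message_with_rescue(job)
  "Failed to get stdout from ansible job #{job.name}"
rescue NoMethodError
  "Failed to get stdout from ansible job <UnknownJob>"
end
```

Both return the same text for a nil job. The rescue variant would also swallow a NoMethodError raised for any other reason, which is one argument for the explicit check.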
@Fryguy, @NickLaMuro I'm actually curious why there's a rescue clause at all in there. All it does is return or log. So, unless the logging itself fails (because coincidentally the job is nil), I'm curious how it would ever reach that condition.
Dan and I discussed the issue and I don't think this PR will be necessary.
FWIW, I added some specs. So, if it turns out we just want a quick fix, this will still be here.
What was the conclusion?
Hi @Fryguy, It doesn't make sense that we don't have a valid job here. I'm starting to look at the environment now.
After further investigation, this code change is important. Here's why:
It's important to note that customers will encounter this issue if they have custom Automate methods in any of the pre* states of this state machine that end in error. Automate State Machine |
To me, then, this feels like the caller to That is, since postprocess seems shared between good and bad runs, it also has to be resilient to good and bad runs. Ultimately postprocess calls
manageiq/app/models/service_ansible_playbook.rb Lines 172 to 175 in 647e640
So I think that method should be changed to account for a nil job... something like:
def log_stdout(action)
  log_option = options.fetch_path(:config_info, action.downcase.to_sym, :log_output) || 'on_error'
  job = job(action)
  if job.nil?
    $log.info("No stdout available due to missing job") # or something to that effect.
  else
    playbook_log_stdout(log_option, job)
  end
end
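To make the caller-side approach concrete, the following self-contained sketch mimics the proposed log_stdout outside of Rails. FakeJob, FakeService, and the in-memory logger are assumptions made for the example; the log_option handling from the real method is left out.

```ruby
require 'logger'
require 'stringio'

# Hypothetical stand-ins: FakeJob and FakeService are not ManageIQ classes.
FakeJob = Struct.new(:name, :stdout) do
  def raw_stdout(_format)
    stdout
  end
end

class FakeService
  attr_reader :log_io

  def initialize(jobs)
    @jobs   = jobs               # action name => job
    @log_io = StringIO.new
    @log    = Logger.new(@log_io)
  end

  # May return nil, like a lookup for a job that was never created.
  def job(action)
    @jobs[action]
  end

  # Unchanged callee: still assumes it receives a real job.
  def playbook_log_stdout(job)
    @log.info("Stdout from ansible job #{job.name}: #{job.raw_stdout('txt_download')}")
  end

  # Caller-side nil check, so the callee can keep its assumption.
  def log_stdout(action)
    current_job = job(action)
    if current_job.nil?
      @log.info("No stdout available due to missing job")
    else
      playbook_log_stdout(current_job)
    end
  end
end

SERVICE = FakeService.new("Provision" => FakeJob.new("provision_job", "PLAY RECAP"))
SERVICE.log_stdout("Provision")   # job exists: stdout is logged
SERVICE.log_stdout("Retirement")  # no job: a short notice is logged instead
```

Keeping the check in the caller means the mixin method keeps a simple contract, while the postprocess path tolerates runs that failed before a job was created.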
@Fryguy ok, updated
Thanks @Fryguy for pointing that out. That makes sense.
Thanks @djberg96
@djberg96 Can you squash the commits? The commit history now shows a change to the method and then an unchange to the method.
Verify job exists in conditional. Initial specs. Add another spec. Restore playbook_log_stdout, modify caller instead.
@Fryguy commits squashed.
Checked commits https://github.com/djberg96/manageiq/compare/deb1ed05d8c3ef1bc8c40197039219b74eb8a5f6~...5a3d48e12890dba5cfa534d07b760c3bc3bf76f9 with ruby 2.6.3, rubocop 0.82.0, haml-lint 0.35.0, and yamllint spec/models/mixins/ansible_playbook_mixin_spec.rb
@djberg96 Can this be |
@dmetzger57 What say you?
@djberg96 I mean the compatibility. Are the changes in this PR compatible with the ivanchuk branch, so that it's OK to backport?
The change should be ivanchuk/yes if cleanly backportable
@miq-bot add_label ivanchuk/yes
Update playbook_log_stdout to handle possible nil job (cherry picked from commit 3493002) https://bugzilla.redhat.com/show_bug.cgi?id=1835226
Ivanchuk backport details:
Update playbook_log_stdout to handle possible nil job (cherry picked from commit 3493002)
Jansa backport details:
Update playbook_log_stdout to handle possible nil job (cherry picked from commit 3493002)
Kasparov backport details:
Currently the AnsiblePlaybookMixin#playbook_log_stdout method assumes that a job will always be present. However, if we look at service_ansible_playbook.rb, we see this call:
playbook_log_stdout(log_option, job(action))
And the job(action) looks like this:
service_resources.find_by(:name => action, :resource_type => 'OrchestrationStack').try(:resource)
We know from the try call that this could thus be nil. And, in fact, that's what @d-m-u hit while testing a variant of #20645 where there's a network interruption:
So this PR updates the method so that it skips logging job.name info if there isn't actually a name.
No specs for now, there aren't any for this mixin currently, and I wasn't sure if they were moved somewhere else.
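As a rough illustration of why the lookup described above can return nil, here is a plain-Ruby sketch. It uses the safe navigation operator (&.) in place of ActiveSupport's try and an in-memory array in place of the service_resources association; ServiceResource, RESOURCES, and find_job are hypothetical names.

```ruby
# Hypothetical stand-in for a service resource record.
ServiceResource = Struct.new(:name, :resource_type, :resource)

RESOURCES = [
  ServiceResource.new("Provision", "OrchestrationStack", :provision_job)
].freeze

# Mirrors the shape of job(action): find a matching record, then ask it
# for its resource. When nothing matches, find returns nil and &. passes
# that nil along instead of raising NoMethodError.
def find_job(resources, action)
  match = resources.find do |r|
    r.name == action && r.resource_type == "OrchestrationStack"
  end
  match&.resource
end
```

With this data, find_job(RESOURCES, "Provision") returns the stored resource, while find_job(RESOURCES, "Retirement") returns nil, which is the value that then reaches playbook_log_stdout.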