-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Result_traceback should not include job stdout #12961
Conversation
awx/main/tasks/receptor.py
Outdated
# massive, only ask for last 1000 bytes | ||
startpos = max(stdout_size - 1000, 0) | ||
resultsock = receptor_ctl.get_work_results(self.unit_id, startpos=startpos, return_sockfile=True) | ||
lines = resultsock.readlines() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm worried about this hanging if the stream doesn't end with a \n
. We may need to put the socket in non-blocking mode before doing this and just read the raw bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did some local testing and readlines() seems to be okay with files that don't end in \n. If it gets an EOF it just stops there.
we could also just do read() or read(1000)
as for non-blocking I added resultsock.setblocking(False)
. Hard to test this exactly, but I will make sure this codepath doesn't throw errors
aeb614e
to
94410bf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shanemcd's comment may be a valid concern, but this doesn't seem to make it any worse than it was before in that respect, and should prevent some other disaster cases.
dd8458a
to
f6cffa7
Compare
# contain useful information about why the job failed. In case stdout is | ||
# massive, only ask for last 1000 bytes | ||
startpos = max(stdout_size - 1000, 0) | ||
resultsock, resultfile = receptor_ctl.get_work_results(self.unit_id, startpos=startpos, return_socket=True, return_sockfile=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's are the version compatibility issues with this? Does this requires an updated receptor, and does that make it unreasonable to backport? Is there any idea to work on a different fix for backports?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the receptor change is minimal, so backporting shouldn't be too hard.
a more minimal version of this that doesn't require a receptor change is to not do get_work_results at all if stdout size > 1000.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, so we could merge that and develop that solution you mention and test it separately. Fine by me.
f6cffa7
to
7d7458f
Compare
I really want to get this merged! |
If a job fails, we do receptor work results and put that output into result_traceback. We should only do this if 1. Receptor unit has failed 2. Runner callback processed 0 events Otherwise we risk putting too much data into this field.
7d7458f
to
36c0d07
Compare
related #12644
SUMMARY
If a job fails, we do receptor work results and put that output into result_traceback.
We should only do this if
Otherwise we risk putting too much data into this field.
Would require this receptor change ansible/receptor#672
ISSUE TYPE
COMPONENT NAME
AWX VERSION