Conversation
actual_stdout_lines = actual_stdout.split('\n')
with open(expected_stdout_path) as f:
    expected_stdout = f.read()
for expected_line in expected_stdout.split('\n'):
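For context, a plausible completion of this fragment, assuming the simple per-line membership check that the patch description below outlines (the assertion message is illustrative, not taken from the diff):

def check_stdout(actual_stdout: str, expected_stdout_path: str):
    """Check that every line of the expected stdout appears somewhere in the actual stdout."""
    actual_stdout_lines = actual_stdout.split('\n')
    with open(expected_stdout_path) as f:
        expected_stdout = f.read()
    for expected_line in expected_stdout.split('\n'):
        # Tolerate extra lines (e.g. progress-bar updates) in the actual stdout
        assert (
            expected_line in actual_stdout_lines
        ), f'Expected line not found in actual stdout: {expected_line!r}'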
In your tests, have you found this op to take a while? It seems inefficient complexity-wise, but I don't have a suggestion for anything better.
Good point about the inefficiency - the desired output is only a few dozen lines, and in practice the quicker of the two tests that use this takes roughly a quarter of a second in total to complete.
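For what it's worth, the `in` test against a list scans all actual lines for each expected line, so the check is O(len(expected_lines) * len(actual_lines)). If it ever did become slow, building a set of the actual lines once would make each lookup O(1) on average. A minimal sketch of that variant, assuming line order and duplicates don't matter for the check (the name check_stdout_fast is hypothetical, not from this PR):

def check_stdout_fast(actual_stdout: str, expected_stdout_path: str):
    # Hypothetical variant: building the set is O(len(actual_lines)) once,
    # and each subsequent membership lookup is O(1) on average.
    actual_stdout_lines = set(actual_stdout.split('\n'))
    with open(expected_stdout_path) as f:
        expected_lines = f.read().split('\n')
    missing = [line for line in expected_lines if line not in actual_stdout_lines]
    assert not missing, f'Expected lines not found in actual stdout: {missing!r}'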
@@ -408,3 +408,26 @@ def _send_agent_message(
            "episode_done": False,
        }
        self.server.send_agent_act(agent_id=agent_id, act_content=act_content)


def check_stdout(actual_stdout: str, expected_stdout_path: str):
I would strongly advise we use a pytest regression.
Yes, normally I would do a pytest regression for this. However, this is a special case, because we want to check the stdout against some reference stdout, but the actual stdout will often, though not always, include strings representing progress indicators - for instance, see the "40%|#### | 4/10 [00:00<00:00, 15.52it/s]" string at https://app.circleci.com/pipelines/github/facebookresearch/ParlAI/8467/workflows/5621488b-c27a-46e8-a76e-3993dee8dd59/jobs/68923. Thus, this PR checks the actual stdout against the reference stdout in a way that is tolerant of these extra strings, but as a result it doesn't use pytest regressions.
* Start refactoring test
* Consolidate stdout-checking code
* Fixes
* Revert dependency
* Fixes
Patch description
Fix one source of flakiness in the model-chat-analysis crowdsourcing CI check, which occurs when a progress bar unexpectedly and irreproducibly prints an update in the stdout that we want to check against the desired stdout. (See https://app.circleci.com/pipelines/github/facebookresearch/ParlAI/8467/workflows/5621488b-c27a-46e8-a76e-3993dee8dd59/jobs/68923 for an example of this.) The fix involves checking simply that each line of the desired stdout is present in the actual stdout, instead of requiring that the two stdouts be exactly the same.
As part of the refactor of the CI check code, the analysis checks for the model-chat and static-turn-annotations tasks are brought into closer alignment, since the code for these similar checks was previously written in quite different ways.
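As a usage sketch only (run_analysis and EXPECTED_STDOUT_PATH are hypothetical stand-ins, not the PR's actual test code), a test along these lines might capture the analysis script's stdout and hand it to check_stdout:

import contextlib
import io

def test_analysis_stdout():
    buffer = io.StringIO()
    # Capture everything the (hypothetical) analysis entry point prints
    with contextlib.redirect_stdout(buffer):
        run_analysis()  # hypothetical entry point for the analysis script
    check_stdout(
        actual_stdout=buffer.getvalue(),
        expected_stdout_path=EXPECTED_STDOUT_PATH,  # hypothetical path constant
    )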
Testing steps
The following checks were modified:
pytest tests/crowdsourcing/tasks/turn_annotations_static/test_turn_annotations_static_analysis.py
pytest tests/crowdsourcing/tasks/model_chat/test_model_chat_analysis.py