Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: fix flaky test for blocking pull shutdown #378

Merged
merged 1 commit into from
Apr 15, 2021

Conversation

plamut
Copy link
Contributor

@plamut plamut commented Apr 14, 2021

Fixes #372.

If a test is run in a suite with other system tests, the messages are not always published in batch sizes as desired, which can affect how ACKs are handled on the backend (the server requires all messages published in a single batch to be ACK-ed in order
to accept the ACKs).

If a publisher client instance is shared between the tests, the batching can apparently be affected, thus we create a new client
instance before each test. Since these tests are slow system tests, the overhead should not be significant.

Steps to verify

Apply the following patch to the code, disable capturing test output (-s option to the py.test command).

diff --git google/cloud/pubsub_v1/publisher/_batch/thread.py google/cloud/pubsub_v1/publisher/_batch/thread.py
index 36dd3b9..c3aacdf 100644
--- google/cloud/pubsub_v1/publisher/_batch/thread.py
+++ google/cloud/pubsub_v1/publisher/_batch/thread.py
@@ -211,6 +211,7 @@ class Batch(base.Batch):
         commit_thread = threading.Thread(
             name="Thread-CommitBatchPublisher", target=self._commit
         )
+        print(f"batch starting commit for {len(self._messages)} message(s)")
         commit_thread.start()

     def _commit(self):

Run system tests in --verbose mode and check the output.

Actual result (before the PR fix)
The system test test_streaming_pull_blocking_shutdown fails with a high probability.

Expected result

The system test test_streaming_pull_blocking_shutdown consistently passes.

In addition, wherever the _publish_messages(..., batch_sizes=[...]) helper is used in the tests, the output size output should match the values passed through batch_sizes. For example, in test_streaming_pull_max_messages(), batch_sizes is set to (7, 4, 8, 2, 10, 1, 3, 8, 6, 1), and the test output matches that:

...
tests/system.py::TestStreamingPull::test_streaming_pull_max_messages batch starting commit for 7 message(s)
batch starting commit for 4 message(s)
batch starting commit for 8 message(s)
batch starting commit for 2 message(s)
batch starting commit for 10 message(s)
batch starting commit for 1 message(s)
batch starting commit for 3 message(s)
batch starting commit for 8 message(s)
batch starting commit for 6 message(s)
batch starting commit for 1 message(s)
...

PR checklist:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

If a test is run in a suite with other system tests, the messages
are not always published in batch sizes as desired, which can
affect how ACKs are handled on the backend (the server requires
all messages published in a single batch to be ACK-ed in order
to accept the ACKs).

If a publisher client instance is shared between the tests, the
batching can apparently be affected, thus we create a new client
instance before each test.

Since these tests are slow system tests, the overhead should not
be significant.
@plamut plamut requested a review from a team as a code owner April 14, 2021 13:12
@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Apr 14, 2021
@product-auto-label product-auto-label bot added the api: pubsub Issues related to the googleapis/python-pubsub API. label Apr 14, 2021
@plamut plamut requested review from pradn and jimfulton April 14, 2021 13:31
@plamut
Copy link
Contributor Author

plamut commented Apr 14, 2021

@jimfulton If you have time and because you are now somewhat familiar with these tests, you too can check this.

Although you said that this particular test was not failing for you locally?

@jimfulton
Copy link
Contributor

Right, I haven't seen this failure.

Copy link
Contributor

@jimfulton jimfulton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the rationale for the fix is that subscriber client state was leaking between tests and causing spurious failures.

@plamut plamut merged commit 13a6686 into googleapis:master Apr 15, 2021
@plamut plamut deleted the iss-372 branch April 15, 2021 14:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: pubsub Issues related to the googleapis/python-pubsub API. cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

The test tests.system.test_streaming_pull_blocking_shutdown is flaky
2 participants