Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flaky-test: C++ test BasicEndToEndTest.testLookupThrottling #11760

Closed
lhotari opened this issue Aug 24, 2021 · 2 comments · Fixed by #11762
Closed

Flaky-test: C++ test BasicEndToEndTest.testLookupThrottling #11760

lhotari opened this issue Aug 24, 2021 · 2 comments · Fixed by #11762

Comments

@lhotari
Copy link
Member

lhotari commented Aug 24, 2021

BasicEndToEndTest.testLookupThrottling is flaky.

example failure

try #1

�[0;33mNote: Google Test filter = BasicEndToEndTest.testLookupThrottling
�[m�[0;32m[==========] �[mRunning 1 test from 1 test case.
�[0;32m[----------] �[mGlobal test environment set-up.
�[0;32m[----------] �[m1 test from BasicEndToEndTest
�[0;32m[ RUN      ] �[mBasicEndToEndTest.testLookupThrottling
2021-08-24 09:22:29.917 DEBUG [140347937134592] ReaderTest:126 | Using Binary Lookup
2021-08-24 09:22:29.917 INFO  [140347937134592] ReaderTest:181 | [<none> -> pulsar://localhost:6650] Create ClientConnection, timeout=10000
2021-08-24 09:22:29.918 INFO  [140347937134592] ReaderTest:84 | Created connection for pulsar://localhost:6650
2021-08-24 09:22:29.919 DEBUG [140347937134592] ReaderTest:514 | [<none> -> pulsar://localhost:6650] Resolving localhost:6650
2021-08-24 09:22:29.919 DEBUG [140347818075904] ReaderTest:542 | [<none> -> pulsar://localhost:6650] Connecting to 127.0.0.1:6650...
2021-08-24 09:22:29.920 DEBUG [140347818075904] ReaderTest:546 | [<none> -> pulsar://localhost:6650] Resolved hostname localhost to 127.0.0.1:6650
2021-08-24 09:22:29.920 INFO  [140347818075904] ReaderTest:366 | [127.0.0.1:51390 -> 127.0.0.1:6650] Connected to broker
2021-08-24 09:22:29.926 DEBUG [140347818075904] ReaderTest:742 | [127.0.0.1:51390 -> 127.0.0.1:6650] Handling incoming command: CONNECTED
2021-08-24 09:22:29.927 DEBUG [140347818075904] ReaderTest:265 | Connection has max message size setting: 5242880
2021-08-24 09:22:29.927 DEBUG [140347818075904] ReaderTest:267 | Current max message size is: 5242880
2021-08-24 09:22:29.927 DEBUG [140347818075904] ReaderTest:151 | PartitionMetadataLookup failed for persistent://public/default/testLookupThrottling, result TooManyLookupRequestException
2021-08-24 09:22:29.927 ERROR [140347818075904] ReaderTest:188 | Error Checking/Getting Partition Metadata while creating producer on persistent://public/default/testLookupThrottling -- TooManyLookupRequestException
2021-08-24 09:22:29.927 INFO  [140347937134592] ReaderTest:88 | Subscribing on Topic :testLookupThrottling
2021-08-24 09:22:29.927 DEBUG [140347937134592] ReaderTest:69 | Got connection from pool for pulsar://localhost:6650 use_count: 4 @ 0xe1b0d0
2021-08-24 09:22:29.927 DEBUG [140347937134592] ReaderTest:151 | PartitionMetadataLookup failed for persistent://public/default/testLookupThrottling, result TooManyLookupRequestException
2021-08-24 09:22:29.927 ERROR [140347937134592] ReaderTest:392 | Error Checking/Getting Partition Metadata while Subscribing on persistent://public/default/testLookupThrottling -- TooManyLookupRequestException
2021-08-24 09:22:29.927 INFO  [140347937134592] ReaderTest:492 | Closing Pulsar client with 0 producers and 0 consumers
2021-08-24 09:22:29.927 DEBUG [140347937134592] ReaderTest:536 | Shutting down producers and consumers for client
2021-08-24 09:22:29.927 INFO  [140347937134592] ReaderTest:1495 | [127.0.0.1:51390 -> 127.0.0.1:6650] Connection closed

and try #2

�[0;33mNote: Google Test filter = BasicEndToEndTest.testLookupThrottling
�[m�[0;32m[==========] �[mRunning 1 test from 1 test case.
�[0;32m[----------] �[mGlobal test environment set-up.
�[0;32m[----------] �[m1 test from BasicEndToEndTest
�[0;32m[ RUN      ] �[mBasicEndToEndTest.testLookupThrottling
2021-08-24 09:22:30.328 DEBUG [140233890162688] ReaderTest:126 | Using Binary Lookup
2021-08-24 09:22:30.329 INFO  [140233890162688] ReaderTest:181 | [<none> -> pulsar://localhost:6650] Create ClientConnection, timeout=10000
2021-08-24 09:22:30.329 INFO  [140233890162688] ReaderTest:84 | Created connection for pulsar://localhost:6650
2021-08-24 09:22:30.330 DEBUG [140233890162688] ReaderTest:514 | [<none> -> pulsar://localhost:6650] Resolving localhost:6650
2021-08-24 09:22:30.331 DEBUG [140233771104000] ReaderTest:542 | [<none> -> pulsar://localhost:6650] Connecting to 127.0.0.1:6650...
2021-08-24 09:22:30.331 DEBUG [140233771104000] ReaderTest:546 | [<none> -> pulsar://localhost:6650] Resolved hostname localhost to 127.0.0.1:6650
2021-08-24 09:22:30.331 INFO  [140233771104000] ReaderTest:366 | [127.0.0.1:51408 -> 127.0.0.1:6650] Connected to broker
2021-08-24 09:22:30.331 DEBUG [140233771104000] ReaderTest:742 | [127.0.0.1:51408 -> 127.0.0.1:6650] Handling incoming command: CONNECTED
2021-08-24 09:22:30.331 DEBUG [140233771104000] ReaderTest:265 | Connection has max message size setting: 5242880
2021-08-24 09:22:30.331 DEBUG [140233771104000] ReaderTest:267 | Current max message size is: 5242880
2021-08-24 09:22:30.331 DEBUG [140233771104000] ReaderTest:151 | PartitionMetadataLookup failed for persistent://public/default/testLookupThrottling, result TooManyLookupRequestException
2021-08-24 09:22:30.331 ERROR [140233771104000] ReaderTest:188 | Error Checking/Getting Partition Metadata while creating producer on persistent://public/default/testLookupThrottling -- TooManyLookupRequestException
2021-08-24 09:22:30.332 INFO  [140233890162688] ReaderTest:88 | Subscribing on Topic :testLookupThrottling
2021-08-24 09:22:30.332 DEBUG [140233890162688] ReaderTest:69 | Got connection from pool for pulsar://localhost:6650 use_count: 4 @ 0x18c50d0
2021-08-24 09:22:30.332 DEBUG [140233890162688] ReaderTest:151 | PartitionMetadataLookup failed for persistent://public/default/testLookupThrottling, result TooManyLookupRequestException
2021-08-24 09:22:30.332 ERROR [140233890162688] ReaderTest:392 | Error Checking/Getting Partition Metadata while Subscribing on persistent://public/default/testLookupThrottling -- TooManyLookupRequestException
2021-08-24 09:22:30.332 INFO  [140233890162688] ReaderTest:492 | Closing Pulsar client with 0 producers and 0 consumers
2021-08-24 09:22:30.332 DEBUG [140233890162688] ReaderTest:536 | Shutting down producers and consumers for client
2021-08-24 09:22:30.332 INFO  [140233890162688] ReaderTest:1495 | [127.0.0.1:51408 -> 127.0.0.1:6650] Connection closed
@lhotari
Copy link
Member Author

lhotari commented Aug 24, 2021

@BewareMyPower do you have a solution for this flaky C++ test?

@BewareMyPower
Copy link
Contributor

Current not. From the existing logs, it's because C++ client encountered segmentation fault during client's shutdown(), more accurately, the ExecutorProvider's close phase (but there's no log).

ioExecutorProvider_->close();
listenerExecutorProvider_->close();
partitionListenerExecutorProvider_->close();

for (ExecutorList::iterator it = executors_.begin(); it != executors_.end(); ++it) {
if (*it != NULL) {
(*it)->close();
}
it->reset();
}

The only clue that I found is that ExecutorServiceProvider::close is not thread safe because executors_ is protected by mutex_ in ExecutorServiceProvider::get while mutex_ is not locked in close method.

I'll fix the theoretical thread-safe problem soon, but not guarantee it can fix the flaky test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants