This repository has been archived by the owner on Jan 24, 2024. It is now read-only.
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Speed up tests and fix flakiness (#628)
### Motivation The KoP CI tests take much more time than CI tests of branch-2.7.2. The main reason is the `cleanup()` phase takes a long time, each time a test is cleaned up, it will take over 10 seconds to complete. This behavior was introduced from apache/pulsar#10199, which made broker shutdown gracefully by default but it would take longer to shutdown. The other reason is caused by rebalance time. According to my observes, when a Kafka consumer subscribes a topic in KoP, it will take at least 3 seconds. Finally I found it's caused by the GroupInitialRebalanceDelayMs config, which has the same semantics with Kafka's [group.initial.rebalance.delay.ms](https://kafka.apache.org/documentation/#brokerconfigs_group.initial.rebalance.delay.ms). It makes Kafka server wait longer for `JOIN_GROUP` request for more consumers to join so that the rebalance count can reduce. However, it should be set zero in tests. After fixing these problems, sometimes the following error may happen and cause flakiness. ``` TwoPhaseCompactor$MockitoMock$1432102834 cannot be returned by getConfiguration() getConfiguration() should return ServiceConfiguration *** If you're unsure why you're getting above error read on. Due to the nature of the syntax above problem might occur because: 1. This exception *might* occur in wrongly written multi-threaded tests. Please refer to Mockito FAQ on limitations of concurrency testing. 2. A spy is stubbed using when(spy.foo()).then() syntax. It is safer to stub spies - - with doReturn|Throw() family of methods. More in javadocs for Mockito.spy() method. ``` It's because `PulsarService#newCompactor` is not mocked well, see apache/pulsar#7102 for detail. ### Modifications - Configure `GroupInitialRebalanceDelayMs` and `BrokerShutdownTimeoutMs` for each mocked `BrokerService`. - Fix the flakiness caused by mocking compactor. After the changes, the tests time has a significant improvement. For example, `GroupCoordinatorTest` takes only 3 minutes now but it could take 9 minutes before. Because the `cleanup()` method is marked as `@AfterMethod` and would be called each time a single test finished. Another example is that `BasicEndToEndKafkaTest` takes only 37 seconds now but it could take 56 seconds before. The `cleanup()` is marked as `@AfterClass` and only happens once, but many consumers will be created during the whole tests and each time a subscribe call can take 3 seconds. * Speed up tests and fix flakiness * Speed up tests that creat their own configs * Ignore golang-sarama tests
- Loading branch information