Build and test pipeline takes too long #12028
Comments
This makes sense to me. Especially as you say:
Before choosing one solution, I want to propose an alternative:
Yep this is what I meant with |
ZPA triage summary:
ZDP planning:
12402: Aggregate equivalent client streams r=deepthidevaki a=deepthidevaki

## Description

Aggregates client streams with the same `streamType` and `metadata` into a single stream that is registered with the server. Payloads pushed to this server stream will be distributed to one of the registered client streams. Currently, the client stream is picked randomly. Later, we can employ a better strategy to choose the client stream to push the payload to.

The streamId for the aggregated stream is generated randomly. When all existing client streams are removed, the corresponding server stream will also be removed. When a new client stream is registered with the same streamType and metadata, a new aggregated stream will be created with a new streamId. This is ok, because as long as there is an aggregated stream in the registry, new client streams will be added to it. Not re-using the previous streamId also helps to prevent edge cases where concurrent remove (of old aggregated stream) and add (of new aggregated stream) requests caused by retries result in inconsistent state.

## Related issues

closes #12253

12406: ci(integration): split module and integration test jobs r=megglos a=megglos

## Description

Reduces overall runtime, as the module tests and qa-integration tests cannot run in parallel due to the Maven module inter-dependencies. Extracting module ITs into a dedicated job thus allows us to get the overall IT stages down to ~10 minutes, while on main the `Integration tests` job that combines module and qa integration tests shows runtimes of about 15m. While at it, I introduced a shared IT job setup that can be configured through a matrix.

While looking at the unit test summary I wondered why the s3 unit tests take about 2m to complete, which is how I found that some ITs were actually run as unit tests. I made sure they are run as ITs going forward [by renaming them](3b5781f).

In total the whole CI run duration is no longer dominated by the integration test job but by multiple jobs that oscillate around 10m.

relates to #12028

12421: feat: default to a better raft request timeout r=oleschoenburg a=oleschoenburg

Using the old default values of:

```yaml
zeebe.broker:
  cluster:
    electionTimeout: 2.5s
    raft:
      enablePriorityElection: true
  experimental:
    maxAppendsPerFollower: 2
    raft:
      requestTimeout: 5s
```

the loss of 2 requests between primary (leader) and secondary (follower) could trigger an unnecessary re-election, because the secondary would not receive any requests from the primary for at least 5 seconds, which exceeds the election timeout.

This changes the default request timeout to always match the default election timeout. Using all default values, we get at least one more request attempt between primary and secondary before re-election and probably more, depending on the exact timing when requests are sent.

closes #12009

12426: Disable test results comment r=npepinpe a=remcowesterhoud

## Description

Disables the test results comment that gets added to PRs. As a team it was decided this was not useful.

## Related issues

closes #

12428: test(qa): save logs of zeebe containers if the test fails r=deepthidevaki a=deepthidevaki

## Description

There were no logs from the brokers or gateway, so it was not possible to debug flaky test #12396.

Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
Co-authored-by: Meggle (Sebastian Bathke) <sebastian.bathke@camunda.com>
Co-authored-by: Sebastian Bathke (Meggle) <sebastian.bathke@camunda.com>
Co-authored-by: Ole Schönburg <ole.schoenburg@gmail.com>
Co-authored-by: Remco Westerhoud <remco@westerhoud.nl>
12430: [Backport stable/8.2] ci(integration): split module and integration test jobs r=megglos a=backport-action # Description Backport of #12406 to `stable/8.2`. relates to #12028 Co-authored-by: Meggle (Sebastian Bathke) <sebastian.bathke@camunda.com> Co-authored-by: Sebastian Bathke (Meggle) <sebastian.bathke@camunda.com>
12430: [Backport stable/8.2] ci(integration): split module and integration test jobs r=megglos a=backport-action

# Description

Backport of #12406 to `stable/8.2`.

relates to #12028

12445: deps(maven): bump grpc-bom from 1.54.0 to 1.54.1 r=github-actions[bot] a=dependabot[bot]

Bumps [grpc-bom](https://github.com/grpc/grpc-java) from 1.54.0 to 1.54.1.

Release notes (sourced from [grpc-bom's releases](https://github.com/grpc/grpc-java/releases)):

v1.54.1, Bug Fixes

- core: Fix NPE race during hedging ([grpc/grpc-java#9853](https://redirect.github.com/grpc/grpc-java/pull/9853)), fixing a Netty buffer memory leak for cancelled RPCs

Commits:

- [`56d1c63`](https://github.com/grpc/grpc-java/commit/56d1c63802fd9706fece4b144969bfc9bdcfc99c) Bump version to 1.54.1
- [`4a5605c`](https://github.com/grpc/grpc-java/commit/4a5605cea3d7cfc9196f35a36a7c60a5ce926eb2) Update README etc to reference 1.54.1
- [`4b01c90`](https://github.com/grpc/grpc-java/commit/4b01c907bc72ca513b0f099fac58601addbe2dd3) core: Fix NPE race during hedging
- [`92b4fae`](https://github.com/grpc/grpc-java/commit/92b4faed40344af6941e6ff506bd0670173e42e1) Pass interop parameters to each langs run.sh as-is. run.sh should just pass t...
- [`1bf518a`](https://github.com/grpc/grpc-java/commit/1bf518af12f66322bd9bae63c667314a0ca41c94) gcp-o11y: Remove monitored resource detection for logging ([#10020](https://redirect.github.com/grpc/grpc-java/issues/10020))
- [`5369df1`](https://github.com/grpc/grpc-java/commit/5369df13a5e24c0e011e75e0f4ac59ab3835cec2) Removes the ExperimentalApi annotation from GcpObservability.
- [`6d21d71`](https://github.com/grpc/grpc-java/commit/6d21d71a257beaddf4805af27ed1d7f77ce86f2b) use glob for example file names which is used in updating release versions (#...
- [`e955afe`](https://github.com/grpc/grpc-java/commit/e955afe50a8ce720f89c0a58c85cc80038a1ce65) examples: Fix grpc version in gcp-observability
- [`5c09616`](https://github.com/grpc/grpc-java/commit/5c09616aae5ff1620ed6bb5867717e0451c2d00f) xds: fix flaky wrr test ([#10005](https://redirect.github.com/grpc/grpc-java/issues/10005))
- [`7d5d25d`](https://github.com/grpc/grpc-java/commit/7d5d25d34e20259962fb6bf0dbe501df761c6948) Bump version to 1.54.1-SNAPSHOT
- See full diff in [compare view](https://github.com/grpc/grpc-java/compare/v1.54.0...v1.54.1)

Co-authored-by: Meggle (Sebastian Bathke) <sebastian.bathke@camunda.com>
Co-authored-by: Sebastian Bathke (Meggle) <sebastian.bathke@camunda.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
12449: [Backport stable/8.0] ci(integration): split module and integration test jobs r=npepinpe a=megglos # Description Backport of #12406 to `stable/8.1`. relates to #12028 Co-authored-by: Meggle (Sebastian Bathke) <sebastian.bathke@camunda.com> Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>
12441: deps(maven): bump maven-dependency-analyzer from 1.13.0 to 1.13.1 r=npepinpe a=dependabot[bot]

Bumps [maven-dependency-analyzer](https://github.com/apache/maven-dependency-analyzer) from 1.13.0 to 1.13.1.

Commits:

- [`54a5339`](https://github.com/apache/maven-dependency-analyzer/commit/54a53397f1f286804cb9ee340ed0243df4d00eb0) [maven-release-plugin] prepare release maven-dependency-analyzer-1.13.1
- [`9c05199`](https://github.com/apache/maven-dependency-analyzer/commit/9c05199709db540441a05578d4727d2a3b6326f8) [MSHARED-1149] Replace System.out by logger
- [`93f9e36`](https://github.com/apache/maven-dependency-analyzer/commit/93f9e364970d50e4afea06ec0280991070d81d24) [MSHARED-1224] Prefer JDK classes to Plexus utils ([#81](https://redirect.github.com/apache/maven-dependency-analyzer/issues/81))
- [`0cfe70f`](https://github.com/apache/maven-dependency-analyzer/commit/0cfe70fcbc00a57c10fbf82890ffdd925d284bad) [MSHARED-1205] Build on JDK 19, 20
- [`9739f21`](https://github.com/apache/maven-dependency-analyzer/commit/9739f21104175ba0fd7c5b53d088cf21618ae8d9) [MSHARED-1219] Upgrade Parent to 39 - code reformat
- [`0ce92bb`](https://github.com/apache/maven-dependency-analyzer/commit/0ce92bbcb8bd9b9a6efc4be21d1b4a8bee8ab2db) [MSHARED-1219] Upgrade Parent to 39
- [`f72c122`](https://github.com/apache/maven-dependency-analyzer/commit/f72c12244669a8fb4a0a8c37d7adc9088506a737) [MSHARED-1220] Refresh download page
- [`2ce5123`](https://github.com/apache/maven-dependency-analyzer/commit/2ce51238d78ddf2093455cc763f81c5a7f66ea99) Disable merge button, add jira autolink
- [`5155be7`](https://github.com/apache/maven-dependency-analyzer/commit/5155be74246574895ede47aa10f8594ce3511075) [MSHARED-1218] Bump asm from 9.4 to 9.5 ([#83](https://redirect.github.com/apache/maven-dependency-analyzer/issues/83))
- [`0372561`](https://github.com/apache/maven-dependency-analyzer/commit/03725618e1d03d7a9d0ac9b83dd1286123f8c0b3) Bump asm from 9.3 to 9.4 ([#70](https://redirect.github.com/apache/maven-dependency-analyzer/issues/70))
- Additional commits viewable in [compare view](https://github.com/apache/maven-dependency-analyzer/compare/maven-dependency-analyzer-1.13.0...maven-dependency-analyzer-1.13.1)

12449: [Backport stable/8.0] ci(integration): split module and integration test jobs r=npepinpe a=megglos

# Description

Backport of #12406 to `stable/8.1`.

relates to #12028

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>
Co-authored-by: Meggle (Sebastian Bathke) <sebastian.bathke@camunda.com>
12449: [Backport stable/8.0] ci(integration): split module and integration test jobs r=megglos a=megglos # Description Backport of #12406 to `stable/8.1`. relates to #12028 Co-authored-by: Meggle (Sebastian Bathke) <sebastian.bathke@camunda.com> Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>
Given that recent improvements brought the CI stage down to about 10m, I would consider the first iteration scope complete. See e.g. https://github.com/camunda/zeebe/actions/runs/5068424932?pr=12850 I would still consider #12417 worth doing soon to ensure the forking of IT jobs is efficient. Let's check in on how to proceed.
ZDP-Triage:
Issue Description
The current compilation time with Maven (`mvn clean install -DskipTests -T1C`) is ~1.5 min for me locally, but this highly depends on the machine. Executing all the tests takes way too long and I never do it locally (which is bad). If I make changes I normally just run the tests in the affected module.
The CI pipeline takes ~33 min to build, test and deploy snapshots. See, for example, this run: 33m 28s.
If we calculate with the ~9 engineers working on this project and say everyone pushes to a branch at least once a day, we run 33 * 9 min = 297 min per day. If we say we have around 200 working days in a year, that is 59400 min of CI time, only for our own work.
This doesn't include daily builds, dependabot (which are a lot), stable branches, etc.
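To make the estimate above easy to re-run with other inputs, here is a minimal Python sketch. The function names and the 10-minute pipeline are hypothetical and only for illustration; the other numbers are the ones from this issue, and one push per engineer per day is the same assumption as above.

```python
# Back-of-the-envelope CI time estimate using the numbers from this issue.
PIPELINE_MINUTES = 33   # observed CI run, rounded down from 33m 28s
ENGINEERS = 9           # approximate team size
PUSHES_PER_DAY = 1      # assumption: each engineer pushes once per day
WORKING_DAYS = 200      # rough number of working days per year


def daily_ci_minutes(pipeline_minutes, engineers=ENGINEERS, pushes=PUSHES_PER_DAY):
    """CI minutes spent per day on regular branch pushes."""
    return pipeline_minutes * engineers * pushes


def yearly_ci_minutes(pipeline_minutes, working_days=WORKING_DAYS):
    """CI minutes spent per year on regular branch pushes."""
    return daily_ci_minutes(pipeline_minutes) * working_days


if __name__ == "__main__":
    current = yearly_ci_minutes(PIPELINE_MINUTES)
    # Hypothetical: what the same workload would cost with a 10-minute pipeline.
    hypothetical = yearly_ci_minutes(10)
    print(f"per day:  {daily_ci_minutes(PIPELINE_MINUTES)} min")
    print(f"per year: {current} min")
    print(f"per year with a 10 min pipeline (hypothetical): {hypothetical} min")
```

With the numbers from this issue this reproduces the 297 min per day and 59400 min per year; daily builds, dependabot and stable branches would come on top.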
If we reduce the test and compile time, this will have a major effect on costs (since we are using self-managed runners) and also gives us a boost in effectiveness, since we get faster feedback.
Potential Idea:
I think there are several ideas to tackle this issue, improving the tests, reducing integration tests and increasing unit test coverage, running partial builds, etc.
But there is also another thing that might be interesting for us to investigate. We have several modules we haven't touched for years. There are also modules that we touch only from time to time, maybe once every quarter.
I would propose we release one version of each of these modules and pin them in our dependencies. This would allow us to skip compiling and running the tests of these modules on every build. For example, it is not necessary to compile and run tests every time for `Zeebe Msgpack Core`, `Zeebe Msgpack Value`, or others.
I think it is important that if we change one of these modules we directly release a new version of it. IMHO that is ok since these modules are and should only be used internally. As long as we keep our public release intact and release `dist` and `clients` with the correct release number this should be fine. This would decrease the time of the general CI pipeline by a lot.
BTW, when we started with Zeebe (or TNGP) each module had its own repo, but we realized that since our APIs were so fragile we had to change a lot across multiple modules, so it didn't make much sense to have them in several repos and release them every time (or use snapshots). But I think this has changed, since a lot of modules are more mature now.
First iteration scope
Optimize overhead of test jobs => target max time 10m
Tasks