code coverage reports are flappy due to non-deterministic tests #1139
Comments
Proposed Fix for #1139. Given that we're only executing specific GitHub actions based on file changes, shared codecov flags might fluctuate between PRs due to variable testing targets. While this PR retains the recommended flag structure for monorepos, it aims to eliminate potential overlaps. However, overlap could still occur with Go versions like 1.19, 1.20, and so on. Personally, I suggest we retain this overlap for now to gauge its potential inconsistencies. If issues arise, appending the Go version as a suffix to the flag could be considered. Admittedly, I'm charting unfamiliar waters with this solution. Insights from those experienced with Codecov would be greatly appreciated. Signed-off-by: Manfred Touron <94029+moul@users.noreply.github.com>
It appears my attempt at #1140 hasn't gone as planned. If anyone has experience with configuring codecov for monorepos and conditional GitHub actions, your insights would be greatly appreciated.
Split coverages by component. Related to #1139

- [x] Added new tests, or not needed, or not feasible
- [x] Provided an example (e.g. screenshot) to aid review or the PR is self-explanatory
- [x] Updated the official documentation or not needed
- [x] No breaking changes were made, or a `BREAKING CHANGE: xxx` message was included in the description
- [x] Added references to related issues and PRs
- [x] Provided any useful hints for running manual tests
- [x] Added new benchmarks to [generated graphs](https://gnoland.github.io/benchmarks), if any. More info [here](https://github.com/gnolang/gno/blob/master/.benchmarks/README.md).

Signed-off-by: Antonio Navarro Perez <antnavper@gmail.com>
Fixing this: ![CleanShot 2023-09-18 at 16 29 59](https://github.com/gnolang/gno/assets/94029/f0139de3-f1d6-412d-9ba9-4f897bf4d2f3) Related to #1139. Signed-off-by: Manfred Touron <94029+moul@users.noreply.github.com>
Codecov seems to be mostly stable now, so this issue is fixed, at least in its most severe form. I'd still like to keep it open: I've updated the OP to describe what I think the current issues are, since we're still seeing a small amount of non-determinism in codecov reports.
Related to #1215.
Partly because our tests are currently non-deterministic, some code may or may not be executed on any given run. This can be seen by checking the codecov report for any trivial PR that doesn't touch code (such as action updates):
(Note: these are not permalinks, so they may have changed if new commits have been added)
We should try to find the cases where code is executed non-deterministically, and either (1) remove the source of non-determinism in our tests or, where appropriate, (2) unit-test the function directly so that it is always covered regardless of non-determinism elsewhere.
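As a rough illustration of option (1), one common source of flaky coverage in Go is ranging over a map, since iteration order is randomized and an early return can skip different branches on each run. The sketch below is not taken from the gno codebase; `sortedKeys` and `firstOverLimit` are hypothetical names, and the assumption is simply that sorting keys before iterating makes the executed path stable.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the keys of m in ascending order. Go randomizes map
// iteration order, so ranging over m directly could visit keys in a
// different order on every run.
func sortedKeys(m map[string]int) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// firstOverLimit (hypothetical) returns the first key, in sorted order,
// whose value exceeds limit. Because the order is fixed, the same lines
// are executed on every run for the same input.
func firstOverLimit(m map[string]int, limit int) (string, bool) {
	for _, k := range sortedKeys(m) {
		if m[k] > limit {
			return k, true
		}
	}
	return "", false
}

func main() {
	m := map[string]int{"b": 5, "a": 7, "c": 1}
	k, ok := firstOverLimit(m, 4)
	fmt.Println(k, ok) // prints "a true"
}
```

For option (2), a small table-driven test that calls a function like `firstOverLimit` directly, covering both the match and no-match paths, would keep those lines covered even if a larger integration test only reaches them occasionally.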
Original issue description
Recently, we made codecov a required check so that PRs reducing coverage are blocked, as detailed in this discussion. This decision was implemented in PR #1120. However, I misconfigured codecov, which was later fixed in PR #1137.
In the interim, @thehowl observed a peculiar behavior, highlighted in PR #1122, which bore similarities to the issues stemming from my codecov misconfiguration.
Interestingly, one of my recent PRs, despite not altering the covered tests, led codecov to report a 16% decrease in coverage. This occurred in PR #1138. This discrepancy could be due to unstable values or potentially another underlying issue. It's worth considering if we should merge a new PR to discern the actual results.
There's a significant possibility that the inconsistency comes from an incorrect configuration. Our usage of codecov is a bit more intricate since we operate within a monorepo. This setup contains multiple independent unit test suites that each send their results to codecov as they finish. Occasionally, a PR will only upload data for a specific subpackage rather than for the entire project.
The core issue might be that while our uploads are accurate, codecov anticipates full coverage results, even for untested packages. If this hypothesis is correct, we could potentially resolve it by adjusting our configuration settings. Alternatively, we might consider a workaround: storing the latest successful coverage from the master branch as an artifact. This stored data could then be forwarded in instances where a test suite is skipped and cannot generate real-time data.
cc @thehowl @ajnavarro @zivkovicmilos