tests: Add xfail_windows decorator for failing tests on Windows #4362

Conversation
I’m kind of proud of this one. Now we can really start working on something, like individually integrating changes from #2616, knowing for sure which changes impact which tests (because we would remove the decorators one by one).
Another important advantage: Windows is the only place where the v.in.pdal tests are run at all, since they are excluded on macOS and Linux, and it is the only place where they all pass. After this PR, they will be enforced to always pass.
v.in.pdal tests have …
This is great progress!
They already have to be handled in a special way, and it is not clear to me what exactly the unit to disable would be (a documentation block?). Originally, there was an idea to use doctest for testing; now it is perhaps more about testing that the examples in the documentation are up to date, and for that, running on one platform would be enough. I don't think we have had much discussion about that lately.
While at some point there was an idea that one shell script could contain more than one test, I think these are the "simple" tests, so one script, one test is how it should be; there is no need to disable at a finer level. It is not clear to me how to approach the scripts from a cross-platform perspective. My idea for the functionality was to accommodate existing test scripts in Bash or sh and people who want to write tests as shell scripts.
Why don't you exclude the failing files? Then you don't have to set the success percent anymore. The success percent setting was just trying to compensate for the missing file-exclusion feature. Now that it is in place (and even the better per-function expected failures on Windows), I don't see a reason to use it anymore.
Oh, now I see… I was so surprised it didn't complain at all.
I didn't know either. That's why I kept it mostly out of scope here. But at the same time, in the recent past we were able to see some failures that were mostly unique here, as the packages in OSGeo4W are usually very close to the latest releases, very early on. So for all the failures related to the latest Python, the latest numpy having different output, or the GEOS format changing, we technically see them first here, if it runs.
I also understand that the goal was to accommodate existing tests. On Windows, it's harder, as we need to have MSYS2 just for that. I did a lot of iterations (not exactly on this), but using the already packaged OSGeo4W grass-dev and running tests from the repo against the installed package. Apart from saving me almost 15 minutes per iteration, it also separated the possible error causes. In those iterations, I struggled a bit to find out why the shell tests weren't picking up everything that was supposed to be set. It turns out the second OSGeo4W build shell script, only used in our current CI, diverged a bit from OSGeo4W: it did a little bit of magic and added MSYS2 paths everywhere, so it was tightly coupled and ended up working. I found a much more limited approach that places only the needed paths in one place, and it works fine. All that to say: shell script tests are a bit of a pain cross-platform, are way slower, and should be avoided at some point.
Once merged, I'll see if it's reasonable to use a floating point variable instead; it will just make it easier. Once merged, it's easier and safer to do refactorings on the test side. Pytest running the unittest-based tests is getting closer: last time I checked, the main missing part was copying the test data files and making them available to the test cases. Most of the rest was working fine. So playing with the gunittest reporting won't be needed at that point, and it's not worth spending too much time on it.
After responding to all the comments of @wenzeslaus, I think it is safe to assume that no further changes will be required in this PR, so I'll go forward with it.
This PR is a big step towards being able to safely refactor tests: almost all tests (see footnote) are now run and checked on Windows, and any failure makes the run fail. The success rate, counted in files, is now 96.8%.
The approach used here is to mark all current Windows test failures as expected failures, which allows re-enabling all previously excluded test files. It also allows files where only some tests were failing to pass as a whole, so all the other working tests in those files are now required to pass.
If a test marked as an expected failure ever ends up passing, it counts as a failure, since it is an unexpected success. This is important, as it is what lets us know when we impacted a test. Skipping a complete file isn't an appropriate solution, as we wouldn't know the results of all the other good tests in the file.
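To make the semantics concrete, here is a small self-contained sketch using plain unittest (class and test names are made up): a passing test marked with expectedFailure is reported as an unexpected success, and the run as a whole fails (since Python 3.4, wasSuccessful() returns False in that case).

```python
import unittest


class DemoUnexpectedSuccess(unittest.TestCase):
    @unittest.expectedFailure
    def test_marked_but_passing(self):
        # This assertion passes, so unittest records an
        # "unexpected success" and the overall run is not successful.
        self.assertEqual(1 + 1, 2)


if __name__ == "__main__":
    unittest.main()
```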
Plain unittest supports conditions in the skip and skipIf decorators, but no conditional variant exists for expected failures. So a new decorator was created in the utils file that wraps this logic and emits a warning with an explanation when the expected failure is actually applied.
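A minimal sketch of the idea; the warning text and exact structure are illustrative assumptions, not necessarily the code merged in this PR.

```python
# Sketch only: assumes the decorator lives next to the other test utilities.
import sys
import unittest
import warnings


def xfail_windows(test_item):
    """Mark a test as an expected failure, but only on Windows."""
    if not sys.platform.startswith("win"):
        # On other platforms the test runs normally and must pass.
        return test_item
    warnings.warn(
        "This test is known to fail on Windows; if it starts passing, "
        "remove the xfail_windows decorator so it is enforced again."
    )
    return unittest.expectedFailure(test_item)
```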
The approach to fixing the tests is to find a usage of the xfail_windows decorator, fix the test, and remove the decorator. If we happen to modify some behavior in a way that makes a Windows test pass (the run fails with an unexpected success), then we also remove the xfail_windows decorator, and from then on that test is not allowed to fail anymore.
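In practice the workflow looks roughly like this; the test module, test name, and the import location of the decorator are hypothetical, while grass.gunittest.case.TestCase and grass.gunittest.main.test are the existing gunittest entry points.

```python
from grass.gunittest.case import TestCase
from grass.gunittest.main import test

# Hypothetical import location for the decorator sketched above.
from utils import xfail_windows


class TestTempMapset(TestCase):
    @xfail_windows
    def test_path_handling(self):
        # Known to fail on Windows today; once fixed there, the
        # decorator is removed and the test must pass everywhere.
        self.assertTrue(True)


if __name__ == "__main__":
    test()
```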
We will now have much more control over what affects our tests on Windows.
Side note: I have made good progress on running pytest in the Windows CI. It now runs, still with 15 failures, but not nearly as many as when errors like "cannot load grass_gis.8.5 library" were happening. So it's much better.
Footnote: the success check uses >=, so this still allows one file to vary without us noticing.