[BUG] Tests have global side-effects. #11285

Open
picnixz opened this issue Apr 4, 2023 · 8 comments

picnixz (Member) commented Apr 4, 2023

Describe the bug

While implementing a feature, I wanted to test it by selectively running some tests, but I found that the tests are not executed in an isolated context: they have undesirable global side-effects.

A simple way to see that is by manually running the following tests:

python -m tox -e py310 \
  -- -s --no-header --no-summary \
  tests/test_ext_autodoc.py::test_autodoc_typed_instance_variables
python -m tox -e py310 \
  -- -s --no-header --no-summary \
  tests/test_ext_autodoc_autoattribute.py::test_autoattribute_typed_variable_in_alias \
  tests/test_ext_autodoc.py::test_autodoc_typed_instance_variables

The first test succeeds but the second one fails with the following assertion error:

AssertionError: assert ['', '.. py:m...ed_vars', ...] == ['', '.. py:m...ed_vars', ...]
  At index 62 diff: '   .. py:attribute:: Derived.attr2' != '   .. py:attribute:: Derived.attr7'
  Left contains 31 more items, first extra item: '      :module: target.typed_vars'
  Use -v to get more diff

The reason is that, in the first test, the documenter being used is a ModuleDocumenter for tests/roots/test-ext-autodoc/target/typed_vars.py. In particular, when this documenter documents its members, the annotations of the Alias class are NOT updated with the annotations of the Derived and Class classes. By adding

if self.fullname == 'target.typed_vars.Alias.attr2':
    logger.warning([self.fullname, self.parent.__annotations__])
elif self.fullname == 'target.typed_vars.Alias':
    logger.warning([self.object, self.object.__annotations__])

after

# now, import the module and get object to document
if not self.import_object():
    return

then the output of the first command is:

# warning:
WARNING: [<class 'target.typed_vars.Derived'>, {'attr7': <class 'int'>}]

However, if we do the same thing when running the second command, we get the following output:

# warning: (this is for test_autoattribute_typed_variable_in_alias)
WARNING: ['target.typed_vars.Alias.attr2', {'attr7': <class 'int'>, 'attr1': 'int', 'attr2': 'int', 'attr3': 'int', 'descr4': 'int', 'attr4': 'int', 'attr5': 'int', 'attr6': 'int'}]

# warning: (this is for test_autodoc_typed_instance_variables)
WARNING: [<class 'target.typed_vars.Derived'>, {'attr7': <class 'int'>, 'attr1': 'int', 'attr2': 'int', 'attr3': 'int', 'descr4': 'int', 'attr4': 'int', 'attr5': 'int', 'attr6': 'int'}]

In other words, test_autoattribute_typed_variable_in_alias is polluting the global context of test_autodoc_typed_instance_variables! The reason is that the AttributeDocumenter used in test_autoattribute_typed_variable_in_alias fires autodoc events which cause the annotations of the parent object (namely Alias) to be updated. Since Alias is an alias, i.e., the very same class object as Derived, the update propagates to Derived, and this is why, when documenting Derived with a ClassDocumenter in test_autodoc_typed_instance_variables, we end up with more annotations than expected.
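
To make the mechanism concrete, here is a minimal, self-contained sketch (not Sphinx code; the names merely mirror the test target) showing that mutating __annotations__ through an alias is visible through the original class:

class Derived:
    attr7: int

Alias = Derived  # an alias is the very same class object

# What the autodoc event handlers effectively do while documenting Alias:
Alias.__annotations__.update({'attr1': 'int', 'attr2': 'int'})

# The update is visible through the original name as well:
print(Derived.__annotations__)
# {'attr7': <class 'int'>, 'attr1': 'int', 'attr2': 'int'}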

Other side-effects can be seen by simply running the following command:

python -m tox -e py310 -- tests/test_ext_autodoc* 

I think we need to make sure that each test does not collide with any other, and never assume that a given test is executed before or after another one. This would be extremely helpful when we simply want to run individual tests (perhaps this can actually be enforced when setting up pytest? a hypothetical sketch follows).
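
For instance, an autouse fixture (this is only a sketch, not part of Sphinx's conftest today; the 'target' prefix is assumed to match the modules under tests/roots/test-ext-autodoc) could drop the documented modules from sys.modules after each test so that the next test re-imports them fresh:

import sys

import pytest

@pytest.fixture(autouse=True)
def _fresh_target_modules():
    # Let the test run first, then forget every imported test-target module
    # so that the following test starts from a clean import.
    yield
    for name in list(sys.modules):
        if name == 'target' or name.startswith('target.'):
            del sys.modules[name]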

How to Reproduce

Compare the output of

python -m tox -e py310 \
  -- -s --no-header --no-summary \
  tests/test_ext_autodoc.py::test_autodoc_typed_instance_variables

with the output of

python -m tox -e py310 \
  -- -s --no-header --no-summary \
  tests/test_ext_autodoc_autoattribute.py::test_autoattribute_typed_variable_in_alias \
  tests/test_ext_autodoc.py::test_autodoc_typed_instance_variables

Environment Information

Platform:              linux; (Linux-5.3.18-lp152.106-default-x86_64-with-glibc2.26)
Python version:        3.10.3 (main, Jan 31 2023, 10:47:25) [GCC 7.5.0]
Python implementation: CPython
Sphinx version:        6.2.0+/e1b15a5c1
Docutils version:      0.18.1
Jinja2 version:        3.1.2
Pygments version:      2.14.0

Sphinx extensions

No response

Additional context

No response

marxin added a commit to marxin/sphinx that referenced this issue Apr 6, 2023

I noticed this while running `pytest --random`: the extension should wipe the CNAME file if the configuration changes and it should no longer be created.

Related: sphinx-doc#11285
marxin added a commit to marxin/sphinx that referenced this issue Apr 6, 2023

Some tests need app.builder.build_all in order to perform a complete rebuild.

Related: sphinx-doc#11285.
marxin added a commit to marxin/sphinx that referenced this issue Apr 6, 2023
Use the `freshenv` argument and remove `outdir` for one test, as the previous one can leave an unexpected leftover.

Related: sphinx-doc#11285.
AA-Turner (Member) commented

In sphinx.testing.fixtures:

-     kwargs['srcdir'] = srcdir = sphinx_test_tempdir / kwargs.get('srcdir', testroot)
+     kwargs['srcdir'] = srcdir = util.path(tmp_path) / kwargs.get('srcdir', testroot)

using Pytest's tmp_path fixture. This fixes most errors I get when running with --random-order, but comes at a significant cost -- test run time goes from 110 seconds to 405 seconds, a 3.7x slowdown.
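
For context, tmp_path is a built-in pytest fixture that hands each test its own fresh temporary directory, which is what buys the isolation here. A minimal illustration (a hypothetical test, not one from the suite):

def test_isolated_srcdir(tmp_path):
    # tmp_path is unique per test, so files written here cannot leak
    # into another test's source directory.
    srcdir = tmp_path / 'srcdir'
    srcdir.mkdir()
    (srcdir / 'index.rst').write_text('content', encoding='utf-8')
    assert (srcdir / 'index.rst').read_text(encoding='utf-8') == 'content'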

A

marxin (Contributor) commented Apr 7, 2023

That's a pretty significant slow-down, I would say. Next week, I can finish the fixes for --random-order and we can enable both (parallel xdist and --random-order) in CI (if you want?). Note that, right now, I can finish the test suite on my Ryzen 9 machine with the following results:

$ pytest -n auto
...
=== 1850 passed, 2 skipped, 5 warnings in 15.39s ===

picnixz (Member, Author) commented Apr 7, 2023

I am moving my reply to #11293 (comment) here:

perhaps we make any state that is needed explicit via module-level setup or fixtures etc?

I have the following suggestions in that direction:

  • For all tests that have global side-effects on a Python module itself or its contents (e.g., autodoc-related tests), we should be able to reload the documented modules (which are usually lightweight, since they only exist for the tests). I assume that those modules essentially have no global state such as caches, so reloading should be sufficient to get a fresh state (a rough sketch follows this list). Alternatively, we could write the code of the module being documented to a temporary file each time (this would ensure that the state of the module is preserved).
  • For IO-related side-effects, I think we need to use a fresh build environment, or define the set of files that are expected at the beginning of the test and remove those that are older than the time at which the test started. That way, we could keep old files (or directories) that are not needed unless they are specifically expected by the test.
  • If a test still fails with the previous guards in place, we can explicitly request a fresh execution, as if the test were entirely isolated, but this should be a last resort.
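
As a rough sketch of the reload idea from the first point, assuming the documented modules live under a 'target' package and have no import-time side-effects worth preserving:

import importlib
import sys

def reload_target_modules(prefix='target'):
    # Reload every already-imported test-target module so that mutable
    # class state such as __annotations__ is restored to a fresh copy.
    # sorted() reloads parent packages before their submodules.
    for name in sorted(sys.modules):
        if name == prefix or name.startswith(prefix + '.'):
            importlib.reload(sys.modules[name])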

marxin added a commit to marxin/sphinx that referenced this issue Apr 12, 2023

The tests modify a source file (index.rst) that is not restored, so another test can pick up the modified source file, which leads to unexpected test results.

Related: sphinx-doc#11285
marxin added a commit to marxin/sphinx that referenced this issue Apr 12, 2023
The test fails with --random due to leftovers in the app.outdir location.

Related: sphinx-doc#11285
marxin added a commit to marxin/sphinx that referenced this issue Apr 13, 2023
Related: sphinx-doc#11285
Co-authored-by: Jean-François B <2589111+jfbu@users.noreply.github.com>
marxin (Contributor) commented Apr 13, 2023

Just a few comments about different levels of independence when it comes to tests:

  • pytest -n auto --dist=loadfile - uses pytest-xdist, where loadfile basically means that a separate environment is reused per file; this achieves a significant speed-up and should work even now
  • pytest -n auto - the default dispatching mechanism feeds workers as they become idle; there are still tests that fail right now
  • pytest --random-order - runs tests sequentially in a random order; most of the PRs I linked to this issue try to address this, and right now there are quite a few problems
  • pytest --random-order --count=X (with X>1) - each test is run multiple times, which makes a proper clean-up even harder for some of the tests

I tried to enable pytest -n auto in GitHub CI but haven't succeeded for some reason:
https://github.com/marxin/sphinx/tree/enable-pytest-xdist

Feel free to comment on my thoughts: what realistic goals can we set?

picnixz (Member, Author) commented Feb 3, 2024

I was trying to fix #10193 and wanted to run only the autodoc test suite. HOWEVER, I observed that tests actually FAIL because of side-effects far more often than I expected. I also tried using --random-order and --random-order-bucket=global with the pytest-random-order plugin to see whether this really matters, and yes, it's non-negligible.

As such, the next thing I'll be focusing on is fixing the test suite. I'll try the following:

  • First, I'll make the suite as slow as necessary by isolating every single test. This will reveal whether a test does not actually work on its own, or whether there are side-effects that are unexpected for us (currently, we might even have side-effects that make a test succeed!).
  • Then, I'll try isolating only the parts that need to be isolated. Some things don't need isolation, but most of the autodoc component must be isolated because it changes the state of modules.

picnixz (Member, Author) commented Feb 10, 2024

So, I found a possible side-effect culprit while implementing #11964. I had put my modules at the same level as the conf.py file because I didn't want to bother with a package. However, it appears that this caused some weird side-effects. I verified locally by running all tests in tests/test_extensions versus only those in tests/test_extensions/test_ext_napoleon_docstring.py. The second case runs correctly, but one of the tests in tests/test_extensions (probably autodoc-related) is polluting the global namespace.

I'll continue investigating but I'll first try to fix #11941.

EDIT: I fixed #11941, so I'll start investigating this one tomorrow.

mansenfranzen commented

I'm facing the same issue: autodoc-related tests succeed when run alone in isolation but fail as soon as multiple tests are run in the same session. I attempted to debug the issue for several hours with no success.

The error only occurs when multiple tests document the same Python module/class. When each test operates on its own Python test data/module, everything is fine.

picnixz (Member, Author) commented Mar 14, 2024

I am working on a reimplementation and I haven't pushed the new changes yet.

I think what I did should be fine, so please wait a few weeks until I have more time!
