Port test suite to pytest #321

Merged
merged 8 commits into from
Aug 22, 2016

Conversation

dean0x7d
Member

@dean0x7d dean0x7d commented Aug 4, 2016

Updated information

Thanks to PR #324 constructor/destructor testing has become easier, so this PR was modified as well.

In short, this PR ports all test cases from print output + reference file checks to the pytest framework, which uses assert statements and a nice introspection engine for reporting failures.

For tests which still require output capture, this PR extends pytest's builtin capfd output capture function for the specific needs of pybind11. It allows the following kind of test:

with capture:  # output capture starts here
    print_from_cpp()  # function uses `std::cout` internally
# capture ends here
assert capture == """
    This is the expected output.
    Printed by std::cout in C++ or print in Python.
    One more line of text.
"""

In addition, the assert can also use capture.unordered which does not enforce line order.
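
As a rough illustration of how such a comparison could behave, here is a minimal sketch. It is not the code from this PR: the `Capture` class is a hypothetical stand-in that only redirects Python-level stdout via `contextlib.redirect_stdout`, whereas the real fixture builds on pytest's capfd so that `std::cout` output from C++ is caught as well. The dedent-and-strip normalization is likewise an assumption about how an indented triple-quoted expectation might be matched.

```python
import contextlib
import io
import textwrap


class Capture:
    """Hypothetical stand-in for the `capture` fixture (Python stdout only)."""

    def __init__(self):
        self.out = ""
        self._unordered = False

    def __enter__(self):
        # Start redirecting sys.stdout into an in-memory buffer.
        self._buffer = io.StringIO()
        self._redirect = contextlib.redirect_stdout(self._buffer)
        self._redirect.__enter__()
        return self

    def __exit__(self, *exc_info):
        # Stop redirecting and keep what was printed inside the block.
        self._redirect.__exit__(*exc_info)
        self.out = self._buffer.getvalue()
        return False

    @staticmethod
    def _lines(text):
        # Dedent and drop surrounding blank lines so that triple-quoted
        # expectations can be indented to match the test code.
        return textwrap.dedent(text).strip().splitlines()

    def __eq__(self, expected):
        actual, wanted = self._lines(self.out), self._lines(expected)
        if self._unordered:
            return sorted(actual) == sorted(wanted)
        return actual == wanted

    @property
    def unordered(self):
        # Return a copy that ignores line order when compared.
        other = Capture()
        other.out = self.out
        other._unordered = True
        return other


capture = Capture()
with capture:
    print("first")
    print("second")

assert capture == """
    first
    second
"""
assert capture.unordered == """
    second
    first
"""
```

The unordered comparison here simply sorts both sides line by line, which is enough for output whose order depends on dict or set iteration.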

However, it is preferable to avoid output capture where possible. This PR converts most tests from:

with capture:
    kw_func1(5, y=10)
assert capture == "kw_func(x=5, y=10)"

to simple one-line asserts:

assert kw_func1(5, y=10) == "x=5, y=10"
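
The difference between the two styles can be sketched in pure Python. The two functions below are hypothetical stand-ins for the bound C++ function before and after the change (the names `kw_func1_printing` and `kw_func1_returning` are made up for this illustration), and `contextlib.redirect_stdout` replaces the `capture` fixture so the snippet runs without pytest.

```python
import contextlib
import io


def kw_func1_printing(x, y):
    # Stand-in for a bound C++ void function that prints its arguments.
    print("kw_func(x={}, y={})".format(x, y))


def kw_func1_returning(x, y):
    # Stand-in for the same function after it is changed to return a string.
    return "x={}, y={}".format(x, y)


def test_with_capture():
    # Old style: run the function, then compare what it printed.
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        kw_func1_printing(5, y=10)
    assert buffer.getvalue().strip() == "kw_func(x=5, y=10)"


def test_with_return_value():
    # New style: compare the return value directly.
    assert kw_func1_returning(5, y=10) == "x=5, y=10"


test_with_capture()
test_with_return_value()
```

With the second form, a failing assert shows both strings side by side in pytest's report, with no capture step involved.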

Original proposal (outdated info)

I realize this is not the best time for this proposal -- the multiple PRs currently in flight are already stepping on each other (causing merge conflicts) and this would be even more disruptive. Nevertheless, I've been tinkering with this idea. I'll leave this here for your consideration, but there's no hurry.

This PR proposes porting the test suite to pytest. This should make for easier development. The nice thing is that most of the tests can be done by directly comparing to values with simple asserts (instead of a reference file) and pytest's failure reports make it easy to debug and pinpoint problems. For the tests that still require output capturing (constructors/destructors), pytest offers more localized per-line output capturing (more details below). This should make it much easier to navigate the tests since the .py and .ref files are merged into one and the expected values/stdout are right next to the code.

The commit contains a proof of concept: the testing framework is complete, but only some of the tests have been ported so far. For clarity, the existing example dir is untouched and the pytest stuff is in the tests dir. Travis and AppVeyor are already configured to run pytest. To run it locally, just run make pytest.

conftest.py is pytest's configuration file. I've extended the builtin capfd output capture function for the specific needs of pybind11. For example:

with capture:  # capture starts here
    m = Matrix(5, 5)
# capture ends here
assert capture.output == "Creating a 5x5 matrix"  # expected output

However, since constructors can't be compared reliably, there are 3 output comparison modes:

  1. capture.output will expect an exact match to the expected string.
  2. capture.relaxed is intended for constructors/destructors and, like the current --relaxed flag, will be disabled on some compilers.
  3. capture.unordered does not enforce line order; it is intended for the randomly ordered output of dict and set.

There is also a custom doc fixture which retrieves and sanitizes docstrings.
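
The PR text does not spell out what the doc fixture does internally, so the following is only a guess at the general idea: a hypothetical `doc` helper that fetches a docstring and removes indentation and surrounding whitespace, making it easy to compare in an assert. It relies on the standard library's `inspect.getdoc`, and `kw_func` is a made-up example function.

```python
import inspect


def doc(obj):
    """Hypothetical helper: return a docstring without indentation or edge whitespace."""
    # inspect.getdoc removes uniform indentation and leading/trailing blank lines;
    # it returns None when there is no docstring at all.
    return (inspect.getdoc(obj) or "").strip()


def kw_func(x, y):
    """
    kw_func(x: int, y: int) -> str

    Return the arguments formatted as text.
    """


# The first line of the cleaned docstring is the signature line.
assert doc(kw_func).splitlines()[0] == "kw_func(x: int, y: int) -> str"
```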

Downsides: This adds pytest as a development-time dependency, but it should be easy enough to install with just pip install pytest.

@jagerman
Member

jagerman commented Aug 7, 2016

I'll chime in here with a couple comments.

Regarding pytest, that sounds good for many of the tests. eigen.py/cpp (which you converted) comes to mind: it mostly consists of output of a bunch of lines such as "test_whatever() OK" or "test_whatever() FAILED"--which is basically already just a test script, but with less sophistication than pytest would give us. On the other hand, for many of the example scripts in example/*.py the python code is, conceptually, part of the example, where providing the python-side counterpart to something in pybind11 is useful. With the current example/whatever.{cpp,py} structure it's obvious where the associated python code is, but it's a little less transparent with the python test code. Perhaps just adding something like "see tests/whatever.py for python code testing this example" to the .cpp headers would address this?

Regarding the constructor/destructor output, I think the current tracking-constructors-via-print statements is worthwhile getting rid of in favour of an object that tracks matching construction/destruction pairs. See PR #324 for code that does exactly this (and gets rid of the need for relaxed mode entirely).
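
To make the idea of tracking construction/destruction pairs concrete, here is a small pure-Python sketch. It is not the code from PR #324: `InstanceTracker` and this `Matrix` class are hypothetical, and the counting happens in Python's `__init__`/`__del__` rather than in C++ constructors and destructors.

```python
import gc


class InstanceTracker:
    """Hypothetical counter of constructions and destructions for one class."""

    def __init__(self):
        self.constructed = 0
        self.destroyed = 0

    @property
    def alive(self):
        # Instances that were constructed but not yet destroyed.
        return self.constructed - self.destroyed


class Matrix:
    stats = InstanceTracker()

    def __init__(self, rows, cols):
        self.rows, self.cols = rows, cols
        Matrix.stats.constructed += 1

    def __del__(self):
        Matrix.stats.destroyed += 1


m = Matrix(5, 5)
assert Matrix.stats.alive == 1

del m
gc.collect()  # helps on interpreters that do not destroy objects immediately
assert Matrix.stats.alive == 0
```

A test written this way checks counts instead of printed text, so it does not depend on the order or number of messages a particular compiler produces.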

@dean0x7d
Member Author

dean0x7d commented Aug 8, 2016

I forgot to mention this in the original post, but the example/tests split which exists in the current commit is mainly because of the WIP status. I expect that I'll need to rebase this a few times because other PRs will modify the example files. Since this is a big change, if I modified the example files directly in parallel, it would lead to some nasty merge conflicts. My approach here is I) keep the example dir untouched, II) work on the new stuff entirely inside the tests dir, III) put it all back together in the end.

Now the question is where to put it all back together. I can see either:

  1. Keep the current example/example-* structure and just move the pytest contents into the example-*.py files (pytest can be configured to accept any file prefix for test discovery).
  2. Move everything into the tests dir and rename the .cpp files to match pytest's usual test_* format.

I'm in favor of the second since it seems to be a more accurate name. Some of the current examples do very detailed testing including implementation details which are not that interesting for someone just looking for examples. In general, a good example is not necessarily a good test, and vice versa. Perhaps it would be nice to have the tests dir be dedicated just for thorough testing, and have a significantly scaled down examples dir which contains only a small number of very short but illustrative examples.

Regarding constructor/destructor tracking, I'll comment in the other PR.

@wjakob
Member

wjakob commented Aug 8, 2016

This looks like a nice cleanup (especially if combined with the constructor/destructor tracking from the other PR). Note that I would prefer it if the tests and the .cpp code stay in the same directory (the code in there is matched 1:1, so it seems weird to me to separate them into different directories).

@dean0x7d
Member Author

dean0x7d commented Aug 8, 2016

The separate tests and .cpp files are only temporary to prevent merge conflicts during the development of this PR. See my last post. Do you have any preference for the final destination: put everything back into example or make them officially tests?

@wjakob
Member

wjakob commented Aug 8, 2016

Got it -- renaming to tests would be fine.

@dean0x7d
Member Author

OK, I've ported all the tests. This is probably difficult to review due to the size, so I made the changes in 2 stages:

  1. The first commit ports all the Python code to pytest, but does not touch the C++ side.

  2. Then the C++ code is modified to avoid output capture where possible. For example, void functions which print values are modified to return std::string instead. As a result, this kind of test:

    with capture:
        kw_func1(5, y=10)
    assert capture == "kw_func(x=5, y=10)"

    can be replaced with a simple:

    assert kw_func1(5, y=10) == "x=5, y=10"

@dean0x7d
Member Author

The last two commits just organize things a little.

  • The inheritance .cpp/.py files were mismatched (even before porting to pytest).
  • There are more enum tests than 'constants and functions', so it seems proper to have a separate file for them.

@dean0x7d dean0x7d changed the title from "WIP proposal: Port test suite to pytest" to "Port test suite to pytest" on Aug 12, 2016
@dean0x7d
Member Author

Updated the documentation. Rather than explain how to install pytest, I made CMake do it automatically.

Note that the first commit fixes some pre-existing sphinx warnings: the code blocks in this section are not currently displayed.

@dean0x7d
Member Author

One of the AppVeyor builds timed out without even starting. Looks like an issue on their end and the build just needs to be restarted.

@aldanor
Member

aldanor commented Aug 13, 2016

I'd like to add, it may make sense to go over the tests and see where the output / the ref file is completely redundant (maybe in 90% of them it is).

It's much better to do things like

assert f() == 'foobar'

(and in case you use pytest, it will use its assertions engine to give you meaningful diffs if something's off)

than

print(f())  # match in .ref file

@dean0x7d
Member Author

@aldanor Yes, definitely. A lot of those cases are handled just by converting to pytest (print -> assert) and commit 9ad37fd simplifies even more tests by avoiding std::cout as well.

@dean0x7d
Member Author

Rebased onto latest master, i.e. ported tests from #308 and #333.

@@ -102,7 +102,7 @@ C++ side, or to perform other types of customization.

.. seealso::

-The file :file:`example/example-operator-overloading.cpp` contains a
+The file :file:`tests/test_operator-overloading.cpp` contains a
Member


Minor, but wouldn't it be a bit cleaner to use one convention for test filenames, dashes or underscore? I.e., test_operator_overloading.

Member Author


Yeah, I'll rename the files on the next rebase.

@aldanor
Member

aldanor commented Aug 14, 2016

Thanks, looks pretty good, modulo a few minor comments.

One other thought I had regarding the test suite: would it make sense to actually build multiple Python test modules and not just one big lump of stuff (i.e. pybind11_tests)? I've already run into name clashes previously and it's only going to get worse as the number of tests grows.

@aldanor
Member

aldanor commented Aug 15, 2016

Just to add to the previous comment re: splitting into multiple modules: aside from cosmetic reasons and avoiding name clashes, this would also be helpful during development when the code in module initialization can fail and the whole test suite then collapses with all tests throwing ImportError (e.g. this happened to me many times while working on numpy-dtypes test module since it does some heavy lifting in module initialization).

@dean0x7d
Member Author

dean0x7d commented Aug 15, 2016

New commits address some of the comments -- I'll squash these with the next rebase if the changes look good.

Regarding splitting up the tests, it would be possible to make separate submodules per test as is currently the case for test_module and test_issues. This would also make importing easier, i.e. just import a submodule per test file instead of importing individual items from the root module.

@aldanor
Member

aldanor commented Aug 15, 2016

Splitting into submodules seems like one way to go. Would it avoid import errors in all submodules if a single submodule fails in module initialization?

def test_eval(capture):
    from pybind11_tests import example_eval

    with capture:
Member


For this test, I think all testing logic should be moved to Python and capture/std::cout eradicated; instead of doing the actual testing, the cpp file could just expose a few functions like

.def("eval_single_statement", 
    [&](std::string s) { py::eval<py::eval_single_statement>(s, main_namespace); })
.def("eval_expr",
    [&](std::string s) { return (py::object) py::eval(s, main_namespace); })
.def("eval_file",
    [&](std::string s) { return (py::object) py::eval_file(s, main_namespace); })

so they can be called from Python, making use of pytest assertions etc.
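
On the Python side, a test built on such bindings might look roughly like the sketch below. The functions here are pure-Python stand-ins, defined with the built-in `exec`, `compile`, and `eval` so that the snippet runs on its own; in a real test they would be imported from the compiled module, and their exact names and signatures are assumptions taken from the suggestion above.

```python
# Stand-ins for the bound functions; a real test would import them from
# the compiled module instead of defining them here.
main_namespace = {}


def eval_single_statement(source):
    # Execute one statement in the shared namespace.
    exec(compile(source, "<string>", "single"), main_namespace)


def eval_expr(source):
    # Evaluate an expression in the shared namespace and return its value.
    return eval(source, main_namespace)


def test_eval():
    eval_single_statement("x = 42")
    assert eval_expr("x") == 42
    assert eval_expr("x + 1") == 43


test_eval()
```

The test logic then lives entirely in Python asserts, and no output capture is needed.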

Use simple asserts and pytest's powerful introspection to make testing
simpler. This merges the old .py/.ref file pairs into simple .py files
where the expected values are right next to the code being tested.

This commit does not touch the C++ part of the code and replicates the
Python tests exactly like the old .ref-file-based approach.

The C++ part of the test code is modified to achieve this. As a result,
this kind of test:

```python
with capture:
    kw_func1(5, y=10)
assert capture == "kw_func(x=5, y=10)"
```

can be replaced with a simple:

`assert kw_func1(5, y=10) == "x=5, y=10"`

There are more enum tests than 'constants and functions'.

Pytest is a development dependency but we can make it painless by
automating the install using CMake.

Test compilation instructions for Windows were changed to use the
`cmake --build` command line invocation which should be easier than
manually setting up using the CMake GUI and Visual Studio.

Most of the test code is left in C++ since this is the
intended use case for the eval functions.

@dean0x7d
Member Author

dean0x7d commented Aug 19, 2016

Rebased onto master and ported new tests from #343 and 8de0437.

Other changes:

  • All test files now have consistent snake_case names.
  • Simplified more tests by removing print/capture.

There are still some multiline print/capture segments (e.g. in test_numpy_vectorize) but I don't think those are worth replacing, since it would only complicate matters rather than simplify (and there are no ordering/compiler issues like there were for constructor tracking).

> Splitting into submodules seems like one way to go. Would it avoid import errors in all submodules if a single submodule fails in module initialization?

Unfortunately, it looks like all binary submodules are initialized at the same time, so this wouldn't solve the import problem. Splitting into actual modules may be the only way to go, but I'm afraid this would lead to a lot of code duplication. I'll look into it.

@wjakob
Member

wjakob commented Aug 19, 2016

Just a quick comment: I think that working with one binary is fine for now. If a change causes the whole test suite to come down crashing then that is useful feedback as well (plus, instantiating the whole module rather than just a few tiny parts is a nice test of reference counting correctness for types and functions)

@dean0x7d
Member Author

Sure, makes sense. Simple submodules can be used just to avoid name clashes in future tests. I took a brief look at maybe splitting the existing tests into submodules, but it wouldn't really add any value at this point -- future tests can easily just add submodules as needed.

@wjakob
Member

wjakob commented Aug 22, 2016

This is fantastic 🎉 -- merged!

@wjakob wjakob merged commit faec30c into pybind:master Aug 22, 2016
@dean0x7d dean0x7d deleted the pytest branch August 23, 2016 13:16

4 participants