Skip to content
This repository was archived by the owner on Apr 14, 2022. It is now read-only.

Running unasync on test files #198

Closed
pquentin opened this issue Jan 27, 2020 · 8 comments
Closed

Running unasync on test files #198

pquentin opened this issue Jan 27, 2020 · 8 comments

Comments

@pquentin
Copy link
Member

One of Hip's main goals is to support both asynchronous I/O (asyncio, trio...) and synchronous I/O. We achieve this by running unasync to generate synchronous code from asynchronous code and writing various I/O backends, which allows us to avoid any duplication between async and sync code. See #149 for more details.

We know how to test asynchronous code (see https://github.com/python-trio/hip/blob/master/test/with_dummyserver/async/test_poolmanager.py) and sync code (by using the urllib3 tests), but we don't know how to keep a single copy of the test files yet, apart from some vague plan of using unasync in tests too.

This issue is about figuring this out: I started thinking about it since we're about to modify unasync to cope with the new hip/ahip naming.

The idea is to generate synchronous tests, ship them as part of the source distribution, install that and run pytest on the installed tests. This new process won't be more annoying because we already need to install the sdist to run tests anyway.

File names

We want to transform hip/test/with_dummyserver/$ASYNC/test_poolmanager.py into hip/test/with_dummyserver/$SYNC/test_poolmanager.py. pytest allows using the same file name (test_poolmanager.py here) as long as we add __init__.py files in the appropriate places.

We need to choose the $ASYNC and $SYNC names. Since async becomes a keyword in Python 3.7, async and sync will only work if we don't need to import those test files. We don't need to import them right now but that's how the PyOpenSSL and SecureTransport tests work in urllib3, so maybe just sticking to _async and _sync is more future-proof.

File contents

The async test file is going to look like this after the hip/ahip rename:

from ahip import PoolManager

from test.with_dummyserver import conftest


class TestPoolManager:
    @conftest.async_backends
    async def test_redirect(self, http_server, backend):
        base_url = "http://{}:{}".format(http_server.host, http_server.port)
        with PoolManager(backend=backend) as http:
            r = await http.request(
                "GET",
                "%s/redirect" % base_url,
                fields={"target": "%s/" % base_url},
                redirect=False,
            )
            assert r.status == 303

            r = await http.request(
                "GET", "%s/redirect" % base_url, fields={"target": "%s/" % base_url}
            )

            assert r.status == 200
            assert r.data == b"Dummy server!"

And I think the sync test file should look like this:

from hip import PoolManager

from test.with_dummyserver import conftest


class TestPoolManager:
    @conftest.sync_backend
    def test_redirect(self, http_server, backend):
        base_url = "http://{}:{}".format(http_server.host, http_server.port)
        with PoolManager(backend=backend) as http:
            r = http.request(
                "GET",
                "%s/redirect" % base_url,
                fields={"target": "%s/" % base_url},
                redirect=False,
            )
            assert r.status == 303

            r = http.request(
                "GET", "%s/redirect" % base_url, fields={"target": "%s/" % base_url}
            )

            assert r.status == 200
            assert r.data == b"Dummy server!"

We need two changes here: ahip -> hip and async_backends -> sync_backend. We can specify those in the setup.py file or we can hardcode that in unasync, because hip is just that special :-).

The decisions we need to make:

  • Does that make sense?
  • What directory names? _async -> _sync?
  • Should we hardcode ahip -> hip and async_backend -> sync_backend or allow specifying it?
@sethmlarson
Copy link
Contributor

sethmlarson commented Jan 27, 2020

Filenames can be tests_async/ -> tests_sync/, there's no reason to have the _ prefix since they're test-cases.

I was thinking this is a good use-case for unasync.customize_build_py() to allow for additional rewrite rules like ahip -> hip and async_backend -> sync_backend, specifically in tests/? And then not having those rules within src/.

@pquentin
Copy link
Member Author

pquentin commented Jan 29, 2020

Filenames can be tests_async/ -> tests_sync/, there's no reason to have the _ prefix since they're test-cases.

Do you mean hip/test/with_dummyserver/test_async/test_poolmanager.py or hip/tests_async/with_dummyserver/test_poolmanager.py? Or something else?

And yes, I plan to use customize_build_by for this!

@sethmlarson
Copy link
Contributor

If we end up putting everything under the source root we wouldn't have an issue with test_async versus test_sync, the test suite would just be duplicated automatically under ahip/tests/ and hip/tests/.

My above comment was more if we decided to go with tests/ not under the source root and end up with tests/test_async and tests/test_sync.

@pquentin
Copy link
Member Author

Something I tend to forget is that there are parts of the codebase that don't do I/O at all, like the URL handling. I think they can stay under "hip". Anyway, here's what our current hierarchy looks like:

├── hip
│   ├── _async
│   │   └── connection.py
│   ├── _sync  # generated
│   │   └── connection.py  # generated
│   └── url.py
└── test
    ├── test_connection.py
    └── test_url.py

Under the new scheme our hierarchy would look like:

├── ahip
│   └── connection.py
├── hip
│   ├── connection.py  # generated
│   └── url.py
└── test
    ├── _async
    │   └── test_connection.py
    ├── _sync  # generated
    │   └── test_connection.py  # generated
    └── test_url.py

And as you mentioned the second option (tests under the source root) would look like:

├── ahip
│   ├── connection.py
│   └── tests
│       └── test_connection.py
└── hip
    ├── connection.py  # generated
    ├── tests
    │   ├── test_connection.py  # generated
    │   └── test_url.py
    └── url.py

I guess this is better because it's a natural way to ship the tests along with the code. I'm not too happy that in both cases there's generated code alongside normal code but it's probably not a big problem in practice.

@sethmlarson
Copy link
Contributor

@pquentin I was assuming that all "sync" code wouldn't exist before running the unasync build step but additionally "non-I/O" code like URLs, config, etc would be under both ahip and hip. The two would be exact copies of each-other from an API perspective. It's not too painful to have the duplication of non-I/O code because we're only writing it once anyways (on the async side, "generated" into the sync side)

In my mind the directory structure would look like this:

├── ahip
│   ├── connection.py
│   ├── url.py
│   └── tests
│       ├── test_connection.py
│       └── test_url.py
└── hip  # all of this is generated
    ├── connection.py
    ├── url.py
    └── tests
        ├── test_connection.py
        └── test_url.py

Then we can have an unasync configuration like the following:

from setuptools import setup
import unasync

setup(... unasync.customize_build_py(
    rules = [
        Rule(from_pattern="ahip", to_pattern="hip"),
        Rule(from_pattern="ahip/tests", to_pattern="hip/tests", additional_replacements={"ahip": "hip"}
    ]
))

Some of the pros I can think of:

  • Tests are automatically packaged with our code. (May be a negative for some users that dislike tests packaged with code).
  • We are guaranteed to test both the ahip and hip versions of all code
  • API is symmetric, users don't have to think about I/O vs. non-I/O
  • unasync will have to support this kind of configuration eventually, especially if we want a broader audience. What are frameworks going to need when they want to use both ahip and hip for their projects?

Some cons:

  • Every test is x2, even if it's maybe not necessary to be?
  • Larger installs, tests are a significant portion of the size of code we ship.

@njsmith
Copy link
Member

njsmith commented Jan 30, 2020

Yeah, I don't think it's worth trying to convince setuptools to do funky special stuff just for the tests. Practically speaking, for day-to-day development, we're going to have a helper script like ./runtests.py that knows how to automatically regenerate the source and tests, run the tests, etc., because we don't want to do that manually. And if we have that, then... what else do we need? Putting those smarts into setup.py specifically doesn't really add anything.

(Putting the smarts for building the library itself into setup.py does make sense, because that's needed to make pip install work. But there's no pip test, so that argument doesn't apply to the tests.)

So I think there's two reasonable options, and they're both workable, so it's really just up to our preference:

  1. We could move all the tests and code into a single ahip package, like in Seth's comment, and generate a hip package alongside it. That way the tests get automatically handled by the same code that handles the main package. ./runtests.py will run unasync to generate the hip package, and then invoke pytest to run the tests in both packages.

  2. We could move all the main project code into a single ahip package, but keep the tests in a separate tests/ directory. ./runtests.py would run unasync to generate the hip package and also to generate the tests_sync/ directory, and then invoke pytest to run both sets of tests with some appropriate PYTHONPATH.

The upsides of the first approach: Slightly simpler. Makes it easier for end-users to run tests, if they feel the need to do so for some reason. Gives us the option to drop the src/ directory, if we want to simplify our directory layout a bit.

Downsides of the first approach: Folks will complain about shipping extra megabytes of tests in their docker images. (urllib3's tests, including .pyc files, are ~1 megabyte, it looks like? So the actual cost is ~2 megabytes per image.) If we're really going to be as widely used as urllib3 then this will inevitably be a common complaint.

I'd be fine with either approach really. The advantages/disadvantages are pretty minor and fairly balanced.

@sethmlarson
Copy link
Contributor

I'm thinking best if we go the route of tests outside the source directory as I feel that'll be the common use-case for other projects? If so I can try tackling this.

@sethmlarson
Copy link
Contributor

I'd say the route of #208 seems great and is where we're headed, going to close this. Reopen if there are still unanswered questions!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants