Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong coverage when using SQLAlchemy+asyncpg+strawberry-graphql #1216

Open
Apakottur opened this issue Aug 24, 2021 · 7 comments
Open

Wrong coverage when using SQLAlchemy+asyncpg+strawberry-graphql #1216

Apakottur opened this issue Aug 24, 2021 · 7 comments
Labels
bug Something isn't working exotic Unusual execution environment

Comments

@Apakottur
Copy link

Apakottur commented Aug 24, 2021

Describe the bug

We are using the (awesome!!) coverage script in our projects and recently we've been getting some incorrect results.

The coverage script shows some code as uncovered even though it's being executed.

To Reproduce
The problem only seems to occur when multiple complex libraries are used together. Here's a minimal example that reproduces the problem:

https://gist.github.com/Apakottur/cc0b0ae6bd982dd1c2f48880ce8d9749

We use docker to fix the package versions, so to run the example one needs docker and docker-compose installed.

With all these files in the same directory, running docker-compose up --build test gives us:

test_1           | This is not covered
test_1           | Name         Stmts   Miss  Cover   Missing
test_1           | ------------------------------------------
test_1           | example.py      20      2    90%   18-19
test_1           | ------------------------------------------
test_1           | TOTAL           20      2    90%

Which is clearly contradictory, as lines 29-30 contain the printout This is not covered, which is printed.

How would we go and debug this further? Is there a way to step through the coverage script and see why it's not considering these lines?

@Apakottur Apakottur added the bug Something isn't working label Aug 24, 2021
@nedbat
Copy link
Owner

nedbat commented Aug 24, 2021

Thanks for the reproducible case :)

I see the same results you do, and I don't understand them. But if I replace await db.commit() with await asyncio.sleep(1), the problem goes away. So it's not something generic about await. Other ORMs have strange execution semantics that are not apparent at first. Does strawberry or sqlalchemy do something very unusual that could interfere?

@Apakottur
Copy link
Author

Thanks for the reply!

I've tried the await asyncio.sleep(1) trick too, and got the same results.

I've also tried reproducting this with only strawberry or only SQLAlchemy, to no avail. They both fiddle with Python internals, hard for me to say what exactly might cause the problem.
There's also no issue if we use the synchronous SQLAlchemy driver, so this is definetly something between all these libraries being used together.

@nedbat nedbat added the exotic Unusual execution environment label Aug 24, 2021
@Apakottur
Copy link
Author

I've updated the gist and simplified the example code even more - it's enough to call SELECT 1 on the DB, no need to have any DB models.

@Apakottur
Copy link
Author

@nedbat I've decided to look a bit into this again and found that making the following change in the Python tracer fixes this specific issue, but I am not sure why. The only change is the addition of the line which "syncs" self. cur_file_data:

# Original
...
elif event == 'line':
    # Record an executed line.
    if self.cur_file_data is not None:
        lineno = frame.f_lineno

        if self.trace_arcs:
            self.cur_file_data.add((self.last_line, lineno))
        else:
            self.cur_file_data.add(lineno)
        self.last_line = lineno
...
# Modified
...
elif event == 'line':
    # Record an executed line.
    self. cur_file_data = self.data.get(frame.f_code.co_filename, None) # This is the new line.
    if self.cur_file_data is not None:
        lineno = frame.f_lineno

        if self.trace_arcs:
            self.cur_file_data.add((self.last_line, lineno))
        else:
            self.cur_file_data.add(lineno)
        self.last_line = lineno
...

Seems like only three tests fail following this change.

I think that maybe one of the libraries leads to self. cur_file_data getting curropted, which leads to incorrect results.
Can I ask why do we need to store it in the data stack in the first place? Can't we always deduce it via self.data.get(frame.f_code.co_filename, None)?

@klaasjanelzinga
Copy link

@Apakottur I stumbled on this as well. SQLAlchemy uses greenlet under the asyncio hood. You need to instruct coverage to take this into account, see #841 (comment)

@Apakottur
Copy link
Author

@klaasjanelzinga Super nice find!
Adding concurrency=greenlet fixes the issue in this case as well.

This seems so obvious in retrospect, I just never got to do a bit more trace investigation.

@nick-benyo
Copy link

Hey + 1 to Apakottur. If you're using FastAPI, make sure to include thread as well. Coverage has a hard time keeping track of both without explicitly declaring. https://stackoverflow.com/questions/74305054/pytest-cov-does-not-cover-all-lines-when-using-sqlalchemy-async-session

For a pyproject definition:

[tool.coverage.run]
concurrency = [
"greenlet",
"thread"
]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working exotic Unusual execution environment
Projects
None yet
Development

No branches or pull requests

4 participants