Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new "force exclude" option #12373

Closed
wants to merge 1 commit into from

Conversation

JohnHBrock
Copy link

Description

This PR adds a --force-exclude CLI option and corresponding config option that behaves identically to --exclude, except files and modules passed explicitly to mypy will still be ignored. This addresses problems people have had where individual filepaths are passed to mypy (typically by some tool like pre-commit or VSCode) and those individual filepaths unexpectedly get processed despite matching the --exclude regex, e.g., issues #11094 and #11760.

The exact case that motivated this PR was a mypy pre-commit hook scanning .py files generated by the protobuf compiler, even though I had exclude = _pb2\.py$ in .mypy.ini.

Test Plan

I added a new unit test at test_find_sources.py::test_find_sources_force_exclude. This is essentially the same as the pre-existing test test_find_sources_exclude; I simply removed the default behavior checks at the top and modified the remaining logic to account for the handling of individual filepaths.

I also rebuilt the docs and found no markup errors after manual inspection. Nor did I find problems after running make linkcheck.

@github-actions
Copy link
Contributor

According to mypy_primer, this change has no effect on the checked open source code. 🤖🎉

@hauntsaninja
Copy link
Collaborator

hauntsaninja commented Mar 17, 2022

Thank you! I'm not sure I'm in favour of this. People are already confused enough about the behaviour of exclude (e.g., expecting it to do something like a per-module follow imports skip) that I'm hesitant to add something that sounds forceful but is almost the same and primarily works around another tool's behaviour.

Since mypy will follow all imports, in most cases users are not actually benefitting from pre-commit passing the files in git diff. I recommend setting the pre-commit flag pass_filenames to false and making your command just be mypy . (or whatever). This also has the advantage of working today.

@JohnHBrock
Copy link
Author

Maybe --more-intuitive-exclude would be a better name? :)

When used manually, I can see why it might make some sense for mypy some_file.py to check some_file.py even when exclude = \.py$ is set; after all, the user requested some_file.py explicitly for a reason, presumably! Unfortunately, this behavior that makes some sense in a manual context leads to unintuitive behavior in an automated context, as evidenced by all the github issues people have created about this.

In a perfect world, I think --exclude would behave like --force-exclude, i.e., if a filepath were explicitly supplied that matched the exclude regex, then mypy would ignore the file (and maybe print that the exclude regex matched the file, in addition to Nothing to do?!). Then you could also have a --force flag that would make mypy ignore the use of --exclude and process the file anyway (as it does today). In this context, adding --force-exclude would just be a stopgap to avoid breaking backwards compatibility until --exclude's behavior could be safely changed and a --force flag could be added.

@hauntsaninja
Copy link
Collaborator

hauntsaninja commented Mar 18, 2022

When used manually, I can see why it might make some sense

What you call using "manually" is not some niche case. mypy is a standalone tool, not a pre-commit plugin.

In my previous comment, I proposed a pre-commit solution to your pre-commit problem that you seem to have ignored: use pass_filenames: false. I'd be happy to add this to mypy documentation somewhere, since it seems like it wasn't mentioned in the two issues you link.

Additional config options have a real usability cost, especially when they're very similar to existing config options.

@JohnHBrock
Copy link
Author

I didn't claim using mypy manually was a niche case, it definitely isn't! But neither is using mypy with VSCode, pre-commit, or other tools (and I also don't think mypy development should be beholden to these other tools). And I didn't mention your solution in my comment, but I promise I didn't ignore it, it's a good solution! I could also make a PR to add that to the docs if you'd like, up to you.

However, the point of my PR isn't just to solve my specific problem, it's to address what I think is a general usability problem with how --exclude works, regardless of my specific issue. This particular behavior is counterintuitive in many cases and could be improved.

I'm not saying my PR is the best or only way to do that, but I think the long-term direction I outlined in my previous comment (introducing --force-exclude until some point in the future when the behavior of --exclude can be modified and a --force flag can be introduced) makes a lot of sense.

I hear you on the usability cost (not to mention development & testing costs) of adding a new config option. I think this cost is acceptable if --force-exclude is only a stopgap measure until --exclude can be modified, but if that's a total non-starter, then I can close this PR.

Let me know what you think, I appreciate the feedback.

@mje-nz
Copy link

mje-nz commented Mar 28, 2022

I came here looking for exactly this feature for exactly this reason. For what it's worth, black has --force-exclude with these semantics.

@JohnHBrock
Copy link
Author

Ah, that's right. I forgot because I started this PR a while ago and then neglected it, but I stole the --force-exclude name and semantics from black.

@hmc-cs-mdrissi
Copy link
Contributor

mypy . solution has one main issue. It applies mypy to all files. For large enough repositories that can be a minute+ command which makes git commits feel much slower and slow CI/commit hooks is one common concern I've heard from teammates as I mostly manage CI. So when used with pre-commit I'd be very avoidant of mypy . for any large repository. My repository is ~100k lines. I'm sure there are much larger python repositories then that were speed would be a stronger blocker.

There are other good workarounds though. The easiest one is pre-commit supports an exclude list of its own. I think using pre-commit exclude is my recommendation. It's minor duplication of excludes but excludes are normally just one/two lines long so I think that's easier than new option. If you really want to avoid duplication you can also make a python script that constructs the file list to mypy and calls mypy with any logic you like.

If we do end up with a new exclude, using pyright as a comparison, pyright has exclude and ignore as separate. ignore is not same as force-exclude but I think would capture goal here. ignore suppresses all error messages for a while regardless of whether it is scanned. Ignore only suppresses error messages and does not affect which files are processed. exclude controls what modules are scanned similar to mypy.

@KotlinIsland
Copy link
Contributor

KotlinIsland commented Apr 15, 2023

the issue with passing a subset of files to mypy is that there is a chance that mypy will then not check a file that will now contain errors:

# a.py
a = 1
# b.py
from a import a
print(a + 1)

If we were to update a.py and change the type of a from an int to a str, then pre-commit mypy will only check a.py and will not report the error within b.py.

@ikonst
Copy link
Contributor

ikonst commented Apr 16, 2023

the issue with passing a subset of files to mypy is that there is a chance that mypy will then not check a file that will now contain errors:

In CI you'd run pre-commit with --all-files, so running only for the changed files locally is a reasonable "risk" to take given it's faster.

Btw, a similar ticket I opened for semgrep, and semgrep is much faster going over our 100k+ LOC repos:
semgrep/semgrep#6494

@hauntsaninja
Copy link
Collaborator

I think pass_filenames: false is the way to go here.
If you have pass_filenames: true and it saves you work, your mypy incantation is buggy. If you really know what you're doing, have multiple mypy hooks.

In general, I think there are some pitfalls that make mypy not a natural fit for pre-commit. See also #13916

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants