Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A story: Getting started with mypy on a large existing codebase #2096

Closed
davidfstr opened this issue Sep 5, 2016 · 9 comments
Closed

A story: Getting started with mypy on a large existing codebase #2096

davidfstr opened this issue Sep 5, 2016 · 9 comments

Comments

@davidfstr
Copy link
Contributor

davidfstr commented Sep 5, 2016

This evening I decided to try out mypy on a large existing Django codebase. Hundreds of .py files.

Surprisingly, I found no actual tutorial on the documentation on how to use mypy on a large existing project as opposed to single file. Therefore I will enumerate a number of issues that I have encountered so far. Perhaps such an enumeration will be useful in writing new documentation, fixing issues, or other improvements.

  • No documented way to run mypy on anything except single files.

I tried invoking mypy on a directory rather than a single .py file. This appears to work, however the documentation does not appear to mention this usage as a possiblity.

  • error: Cannot find module named 'FOO.BAR.BAZ'

I receive the following error when there certainly exists a file ./FOO/BAR/BAZ.py. Therefore this error makes no sense to me. I invoked mypy with MYPYPATH=. mypy ..

  • error: Need type annotation for variable

Why? This points to a line that looks like dependencies = []. Error message should be improved to indicate why it is demanding a type annotation.

  • error: Method must have at least one argument

Well done. This was flagging a number of unittest methods marked with @skip that took no arguments. Such as the following:

class BeaconBasicTests(SimpleTestCase):
    @skip('not yet automated')
    def test_given_beacons_exist_when_page_reloaded_then_beacons_still_exist():
        pass

It was expecting (self) as the signature, which seems reasonable.

  • error: Result type of + incompatible in assignment

Appears to be a mypy bug. Was complaining about the following construction:

INSTALLED_APPS = (
    'django.contrib.admin',
    ...
)
if USE_DEBUG_TOOLBAR:
    INSTALLED_APPS += ('debug_toolbar',)  # error here
  • error: Value of type "object" is not indexable

Appears to be a mypy bug. Was complaining about the following construction:

LESSON_OBJECT_TYPES = {
    'warm_up': dict(
        title='Warm Up',
        ...=...,
    ),
    ...
)
x = LESSON_OBJECT_TYPES['teacher_coding_exercise']['expected_file_types']  # error here
  • error: Callable[[Any, Any, Any], Any] has no attribute "short_description"

Appears to be a mypy bug. Was complaining about the following construction:

def upload_deck(modeladmin, request, queryset):
    ...

upload_deck.short_description = 'Upload slides to selected deck'  # error here

It is legal (albeit advanced) to annotate a function object (including a lambda) with additional attributes.

  • error: Name 'Foo' already defined

Common to receive this kind of error when using star-imports. For example:

=== A.py
from B import *
import C

=== B.py
import C

In the above scenario mypy will complain that C is already defined in A.

Recommend altering there to be an error only if Foo is defined more than once and is defined to be something different.

That's all folks. I suppose I'll wait to add my own type annotations until mypy gives my current unannotated code a clean bill of health.

@refi64
Copy link
Contributor

refi64 commented Sep 5, 2016

No documented way to run mypy on anything except single files.

http://mypy.readthedocs.io/en/latest/command_line.html#specifying-files-and-directories-to-be-checked

error: Need type annotation for variable

http://mypy.readthedocs.io/en/latest/common_issues.html#types-of-empty-collections

error: Value of type "object" is not indexable

#1786

error: Callable[[Any, Any, Any], Any] has no attribute "short_description"

#708

error: Name 'Foo' already defined

Try putting #type: ignore on the second import.

@davidfstr
Copy link
Contributor Author

Thank you for taking the time to respond and provide references. Particularly on a holiday.

No documented way to run mypy on anything except single files.

http://mypy.readthedocs.io/en/latest/command_line.html#specifying-files-and-directories-to-be-checked

That topic ("The mypy command line") is at position 15/18 in the reference documentation and I did not see it before. Under the assumption that running the mypy tool would be a topic of primary interest when reading the mypy documentation, it might make more sense to feature this topic earlier in the documentation to increase visibility.

error: Need type annotation for variable

http://mypy.readthedocs.io/en/latest/common_issues.html#types-of-empty-collections

Although it may not be possible to deduce the specific type of an empty list literal beyond list[Any], does it make sense to issue an error here? It seems like list[Any] could simply be assumed.

#1786: Do note that mypy currently doesn't support different (fixed) keys of a dict having different types -- that's probably what's biting you. That and the empty dicts.

Interesting. Is this limitation of dicts intentional or something that is candidate to amend in the future? (That is, if I or someone were to submit a patch to support such dicts, would it be rejected for some philosophical reason?)

This leads to a separate philosophical question as to what kinds of things the type checker is intended to flag. For example is it the goal that the type checker not issue any errors when run on an unannotated program that has 100% code coverage? Another way of phrasing the question: should the type checker accept all programs that execute properly at runtime even if they're doing something "weird" in the opinion of the type system designer?

@JukkaL
Copy link
Collaborator

JukkaL commented Sep 6, 2016

Although it may not be possible to deduce the specific type of an empty list literal beyond list[Any], does it make sense to issue an error here? It seems like list[Any] could simply be assumed.

This is something we've been considering for a long time. The reason why mypy doesn't do this right now is that then it would be easy to accidentally leave partially unchecked parts of code. Besides, annotating these variables usually isn't a lot of work.

We've been thinking about a "strict mode" that would require all functions to be fully annotated, among other things. It could enforce that no implicit List[Any] types would be inferred for variables, but we could infer these types in the default mode to make mypy adoption easier.

Interesting. Is this limitation of dicts intentional or something that is candidate to amend in the future? (That is, if I or someone were to submit a patch to support such dicts, would it be rejected for some philosophical reason?)

I've thought about this a bit, and this would be a pretty complex feature to implement. It would likely require changes to PEP 484 and/or the typing module, unless we decide to make it a mypy-only feature. We'd be happy to discuss the specifics of such as feature (there is an existing issue in the typing tracker: python/typing#28), but as it's potentially going to be a big change with a lot of corner cases that are difficult to get right, we might not want to receive such a big PR in the near future, as it would require somebody from the core team to spend a lot of time reviewing the code. But say 6 months from now, things could be quite different and we might be thrilled to receive a PR, but again only once we've discussed how it should work in general.

For example is it the goal that the type checker not issue any errors when run on an unannotated program that has 100% code coverage?

Well that would be nice, but it's not likely to happen, due to various technical difficulties. Basically Python is such a dynamic language that a static checker has to make some simplifying assumptions about how programs behave, and sometimes these assumptions are not right.

The actual goal is to keep the number of errors low (enough) for most Python code and to generally make it easy to work around the errors.

The way I think about this is actually pretty simple: adopting static typing will take some work anyway, since writing function annotations takes some effort, and without function annotations mypy won't be terribly useful. We try to reduce the total amount of work needed to annotate a program, and fixing useless errors generated when checking unannotated programs is just one thing among many that we are working on to improve, and with limited resources we have to prioritize tasks.

@rowillia
Copy link
Contributor

rowillia commented Sep 6, 2016

@davidfstr I'm going through a similar exercise now, one example I've found to be super useful is the https://github.com/zulip/zulip codebase. It's mostly annotated with PEP 484 annotations so it's a great real world example and they've got a useful script for invoking MyPy with a Blacklist - https://github.com/zulip/zulip/blob/master/tools/run-mypy

@gnprice @JukkaL Would it be possible to get the Zulip folks at Dropbox to do a blog post or a doc on their adoption of MyPy? I'd love to have more insight into how the transition went.

@davidfstr
Copy link
Contributor Author

Thanks for the insights.

I'll close this thread for now, until I have more time to apply mypy against my Django codebase again.

@davidfstr
Copy link
Contributor Author

davidfstr commented Sep 15, 2016

I managed to minimally annotate all existing code so that mypy gave no errors.

The kinds of issues I needed to fix were:

  • Annotated many empty list literals.
  • Annotated many None-assignments.
  • Annotated many heterogeneous dictionary types.
  • Suppress typechecking on star-imports.
  • Suppress typechecking when annotating function objects with attributes.
  • Fix some syntax issues in preexisting annotations.

All of these error classes I've mentioned earlier in this thread.

Edit 1: Reflect that not all empty list literals needed annotation.
Edit 2: Reflect that suppressing type-checking on star-imports is no longer necessary, with the fix of #2135 in mypy 0.4.5-dev-a2d6638b3b746b54f2dd911c460148e1a1886ab9

@gvanrossum
Copy link
Member

All in all not a bad result!

I'm curious -- are you sure you had to annotate all empty list literals? I'm saying because mypy typically doesn't need this when you write e.g.

# inside a function
xs = []
for x in ...:
    xs.append(x)

However it does need it in some cases where the initialization code is more complex, and AFAIK in all cases where it's a class attribute or global. So that's probably what you're seeing.

@davidfstr
Copy link
Contributor Author

Ah you're correct. Only a few []s had to be annotated. As you say, class attributes and globals in particular.

@davidfstr
Copy link
Contributor Author

One item I forgot to mention:

  • Added __init__.py files to several packages so that they wouldn't be ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants