Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use pylint to enforce greater code quality #504

Merged
merged 10 commits into from
Jan 13, 2022
Merged

Conversation

sirosen
Copy link
Member

@sirosen sirosen commented Jan 11, 2022

This changeset introduces pylint.

There are a few issues with pylint:

  1. It has several rules like the refactoring rules which are prone to false positives
  2. Some of its rules are simply too strict (e.g. lazy log formatting) and force a lot of change when it is introduced
  3. Even with a carefully curated ruleset, like any linter it can show false-positives
  4. It's slow! Too slow to run as a pre-commit hook

However, after trying it (again) I find that it discovers several real issues in our code. I'd like us to try to keep pylint happy.
In addition to the config and CI changes, there are the following changes to our codebase:

  • remove unnecessary pass directives
  • sync_level on TransferData now raises a ValueError on unrecognized strings [1]
  • when raising a new exception in a handler, always use raise X from err syntax (chaining)
  • always explicitly use "utf-8" when opening files [2]
  • avoid shadowing names from an external scope when entering a narrower scope
  • document all function parameters -- undocumented parameters will be caught

[1] was changed in 8e5cd81 . The commit message details the full change. This is technically backwards incompatible because the unsupported value is now rejected with a ValueError when TransferData is constructed, not a GlobusAPIError when it is submitted. However, I think we can take the stance that this is a reasonable improvement to a behavior which was not quite right from the start, and therefore is an improvement. Doing this in a minor version still follows the spirit of semver.

[2] a031c38 is a similar case of "spirit of semver, not the letter of it". It is technically possible that someone has written a latin-1 encoded file with non-ascii content via the token storage adapter, and that this will break. However, the chances of this in practice are very slim. Python versions 3.11 or later may actually change to UTF-8 by default anyway, and using the locale-specified encoding is a messy bit of code we'd rather not have in the long term. Similarly, I'm fairly sure that gridmap is ASCII-only. Specifying UTF-8 now simply makes us explicit about our expectations, and is probably what we should do in all cases.

The changelog fragment shows how I would explain this change to users.

Setup a pylint config, set to ignore various formatting related
messages, which otherwise conflict with black.
Given that `*args, **kwargs` are often used as passthroughs to accept
and discard arguments intentionally, these are ignored when unused.
The docparams plugin was tested, but produces many warnings right now.
Disable for the time being.

Address some of the warnings:
- unused `**kw` (update to `**kwargs`)
- redefined `cast` in `env_vars` (also resolve some typing complexity
  with better overloads)
- `raise` rather than `raise ... from ...`
- one undocumented `:raises:` from before disabling docparams
- incorrect unsubscriptable-object is ignored on type subscripting
- The type annotation for `sync_level` now includes `int` as an option
- If the sync_level is an unknown string, a ValueError will be raised,
  rather than allowing the string to be passed to the API
- An unknown int is still passed through
- A new test checks this behavior
Add exception chaining in the `by_scopes` lookups for token
responses and b32 decoding in GCP owner-info parsing.
Fixme comments, use of soft keywords, and inaccurate warnings are
disabled.
Whenever `open()` is called, pass `encoding="utf-8"`. This is the
de-facto standard for text encoding and is unlikely to cause users any
issues.

The most likely cases for file IO are ASCII-only. Although this is
technically minorly backwards incompatible, chances are it will have
no deleterious impact on users. Furthermore, there are changes in
behavior for python which are happening around bare `open()` calls:
  https://www.python.org/dev/peps/pep-0597

As PEP597 notes, the previous default behavior could be requested
explicitly with

    open(..., encoding=locale.getpreferredencoding(False))

However, this assumes that we want to make the locale-specified
encoding the default for the SDK. That is not the case -- it would be
better to always use UTF-8 and be explicit where we care.

The only case today in which this may be incorrect is GridFTP gridmap
for GCP. If it turns out that this is encoded in latin-1 or the user's
preferred encoding, we will adjust the encoding used to read the file
to match the GridFTP behavior. This is assumed to not be a problem
until proven otherwise.
Enable the docparams plugin, comment on pylint ignore rules, and add
pylint to our `make lint` step.
All docparams-flagged issues are resolved.
Update the github actions primary build workflow, doing various bits
of cleanup and ensuring that `make lint` is run (so that `mypy` and
`pylint` are enforced)
@sirosen sirosen merged commit 792f55d into globus:main Jan 13, 2022
@sirosen sirosen deleted the support-pylint branch January 13, 2022 17:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant