Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ruff specs #6

Open
sosey opened this issue Dec 5, 2024 · 6 comments · May be fixed by #11
Open

Add ruff specs #6

sosey opened this issue Dec 5, 2024 · 6 comments · May be fixed by #11
Assignees
Labels
testing Relates to CI/CD, tests, or things that support them

Comments

@sosey
Copy link
Collaborator

sosey commented Dec 5, 2024

We need to decide on the set of specs we want to use with ruff and add them to the template

@sosey sosey self-assigned this Dec 5, 2024
@rknop
Copy link

rknop commented Dec 5, 2024

I would not turn on any automatic reformatting, because my experience with that is that it destroys your code.

Even just with the linter there are some in the defaults that I don't like. The one that I have the most trouble with is E402: https://docs.astral.sh/ruff/rules/module-import-not-at-top-of-file/

While that sounds like a good idea, it's not practical. We have to use various libraries, and some of them behave badly. For example, the NOIRlab queryClient library dies on import if there isn't a .datalab directory underneath your home directory. This gets created when the package is installed, but if you then move home somewhere else later (which will happen in Docker files), you have to recreate that directory. This is bad behavior on the part of queryClient, but if you want to use it, you have to work around it by ensuring that directory exists before the import, and E402 will trip that. This is just one example; we're going to find bad behavior in other libraries where various things have to be set before we can import them.

I am morally opposed to E201 and E202 (no whitespace after open bracket or before close bracket). I know PEP-8 says that you're not supposed to have whitespace there, but PEP-8 opens with "a foolish consistency is the hobgoblin of small minds". (This to my mind is the problem with code formatters in general, and part of why I hate black so much; they value mindless consistency over readability.) I find that unless the call is short (shortish function name, single shortish parameter), it's way more readable to have a space after the opening parentheses and before the closing parentheses. (Whitespace can really improve readability.)

In general with linter rules, there are two issues. The first is, if we're going to make some of these things errors that prevent committing to the archive or merging to main, then they're too restrictive. Even if they're just warnings that the user has to read, "warning fatigue" with lots of feedback that's not actually important means that people's eyes are going to glaze over and they're going to miss the things that really do matter.

So, as a general suggestion, I'd say that we pick the things that really matter, and focus on that. Many of the ruff defaults are fine, but I'd want to spend some time really thinking about all of them before I'd be happy turning them on.

I would also not have it do any automatic reformatting. Flag the style problems, and then either suggest or require that the programmer fix them themselves. Sometimes there's a good reason for violating a formatting rule, and automatic reformatting doesn't understand that.

We do have to realize that if we pick stuff now, it could make life harder if we want to add a restriction later, unless it's one that's not already violated very often.

@rknop
Copy link

rknop commented Dec 5, 2024

I would like to turn on W291, which detects trailing whitespace at the end of a line.

@sosey
Copy link
Collaborator Author

sosey commented Dec 5, 2024

Great feedback @rknop, this will help us figure out what we do want in there. This goes along with deciding on pre-commit use too :)

@rknop
Copy link

rknop commented Dec 5, 2024

OK, I have an extensive list of things that I would include. For a completely different project (the LS4 pipeline), I put these in to see how they'd do with code that I wrote a lot of (and that some other people did some of).

This may be too many things, and may cause too much pain. First, here's the ruff.toml that I'm currently using for that other project:

# Ignore these directories
exclude = [
    "extern/**",
    "webap/rkwebutil/**",
    ".git",
    ".github",
    ".ruff_cache",
    ".vscode",
    "__pypackages__",
    "alembic/**",
    "conductor/rkauth_flask.py",
    "util/archive.py",
    "util/rkauth_client.py"
]

# Formatting standards
line-length = 120
indent-width = 4

# Python 3.11 is the target version
target-version = "py311"

[lint]
preview = true

select = [ 'F', 'E101', 'E111', 'E112', 'E113', 'E115', 'E117',
           'E204', 'E223', 'E224', 'E242', 'E265', 'E273', 'E274', 'E275',
           'E301', 'E302', 'E305', 'E306', 'E401', 'E501', 'E502', 'E703',
           'E711', 'E713', 'E714', 'E72', 'E74',
           'W19', 'W29', 'W39', 'W605',
           'N804', 'N805', 'N807',
           'D206', 'D300', 'D301',
           'UP010', 'UP011', 'UP012', 'UP013', 'UP014', 'UP017', 'UP018', 'UP019',
           'UP02', 'UP030', 'UP031', 'UP033', 'UP034', 'UP035', 'UP036',
           'UP037', 'UP039', 'UP04',
           'PLE0100', 'PLE0101', 'PLE0116',
           'NPY',
           'RUF018',
           'E301', 'E302', 'E306', 'W505', 'D200', 'D212', 'RUF021' ]

Here is the notes I made reading through the RUFF rules. I do have to admit that my eyes glazed over for a while reading through some of the 800 rules, and I kind of skipped some categories...

All F

E101 mixed-spaces-and-tabs -- all the "tabs" thing may be unneded if we use W191
E111 indentation-with-invalid-multiple
E112 no-indented-block
E113 unexpected-indentation
E115 no-indented-block-comment
E117 over-indented

E204 whitespace-after-decorator
E223 tab-before-operator
E224 tab-after-operator
E242 tab-after-comma
E265 no-space-after-block-comment
E273 tab-after-keyword
E274 tab-before-keyword
E275 missing-whitespace-after-keyword
E304 blank-line-after-decorator
E305 blank-lines-after-function-or-class

E401 multiple-imports-on-one-line

E501 line-too-long --> but not with the default line length setting.
       My preference is 120 characters.  I could live with anything
       between 110 and 150.

E502 redundant-backslash --> In fact, I prefer avoiding backslash line
       continuation altogether.  With W291, it's less of a problem.
       But, in my experience, you can always use parentheses when you
       need one line to be over multiple lines, and the backslash thing
       is ugly and hazardous.

E703 useless-semicolon
E711 none-comparison
E713 not-in-test
E714 not-is-test
E721 type-comparison
E722 bare-except
E741 ambiguous-variable-name
E742 ambiguous-class-name
E743 ambiguous-function-name

E902 io-error

W191 tab-indentation
       --> I actually believe that codepoint 9 should be forbidden in
           all plain text files (like source code), or most things
           that are meant to be read in a monospace font.  Tabs are
           for wysiwyg.
W291 trailing-whitespace
W292 missing-newline-at-end-of-file
W293 blank-line-with-whitespace
W391 too-many-newlines-at-end-of-file
W605 invaid-escape-sequence

N804 invalid-first-argument-name-for-class-method
N805 invalid-first-argument-name-for-method
N807 dunder-function-name

D206 indent-with-spaces
D300 triple-single-quotes
D301 escape-sequence-in-docstring

UP008 super-call-with-parameters
UP019 typing-text-str-alias
UP020 open-alias
UP021 replace-universal-newlines
UP023 deprecated-c-element-tree
UP025 unicode-kind-prefix
UP029 unnecessary-builtin-import
UP035 deprecated-import

PLE0100 yield-in-init
PEL0101 return-in-init
PEL0116 continue-in-finally

NPY001 numpy-deprecated-type-alias
NPY003 numpy-deprecated-function
NPY209 numpy2-deprecation --> may be painful, but we should

RUF018 assignment-in-assert

Mixed feelings:

E231 missing-whitespace
       ---> Almost always I want this.  There are edge cases where I
           don't.  I'm inclined to put this in because I've seen too
           many students & post-docs whose style is to have no
           whitespace anywhere in function calls or function
           definitions, and having no spaces after commas makes the
           whole thing really hard to read.

E261 too-few-spaces-before-inline-comment
E262 no-space-after-inline-comment

E301 blank-line-between-methods
     --> I'm in favor of requiring at least one blank space at
         the end of a method.  I'm opposed to requiring no more
         than one space.  In the "whitespace increases readbility"
         world, it's often nice to have two blank lines after the
         end of one method and before the start of the next.
         I *think* this one just checks the first and E303 checks the
         second.  If that's the case, I'd like this one in, but it
         might annoy other people.

E302 blank-lines-top-level
     --> Again, I'm happy with requiring 2, unhappy with forbidding more than
         2; same comment as E301
         
E306 blank-lines-before-nested-definition
     --> Same thing; like enforcing one.  Maybe could be convinced to
         enforce not greater than one, but in general rules that
         prohibit whitespace make me nervous

W505 doc-line-too-long
     --> It's reasonable to limit docstring lines to <80 characters.
         It's not reasonable to limit comments.  The docs on this
         one suggest it might do both.  If it only does it on
         docstrings, I'm OK with this, but if it's going to limit
         comments, I'm not.

D200 fits-on-one-line
D212 multi-line-summary-first line
     --> For both of these, I much prefer this style myself.  I think
         it's probably too anal to demad everybody do it.

NPY002 numpy-legacy-random

RUF021 parenthesize-chained-operators
       --> I really want this.  In fact, for clarity, I'm in favor
           of using lots of extra parentheses in expressions to
           make it clear what's going on if you don't immediately
           remember all the operator precedence rules.  But...
           it may piss other people off.
         




Gratuitous comments:
   E251 and E252 are weird; why spaces in one place and not the other?

@rknop
Copy link

rknop commented Dec 5, 2024

E703 is the "have you been writing a webap and going back and forth between JavaScript and Python a lot" check....

@sosey sosey added the testing Relates to CI/CD, tests, or things that support them label Dec 11, 2024
@benjaminrose
Copy link

I am good with this list. Do we want it in pyproject.toml or ruff.toml?

benjaminrose added a commit that referenced this issue Jan 23, 2025
@benjaminrose benjaminrose linked a pull request Jan 23, 2025 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
testing Relates to CI/CD, tests, or things that support them
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants