Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Python] Add support for type comments and annotations #3736

Merged
merged 18 commits into from
Jun 25, 2023

Conversation

deathaxe
Copy link
Collaborator

@deathaxe deathaxe commented May 10, 2023

Supersedes #1925

This PR proposes to add dedicated contexts for type hints.

It provides highlighting for type comments including feedback from #1925.

Existing type annotations are refactored to make use of type-expressions and follow a common scope naming scheme for type hints. It is inspired by PHP, TS and TSX, which use meta.type to scope type hint expressions.

Known type classes from typing module are scoped as support.class.typing.

deathaxe added 8 commits May 10, 2023 15:12
This commit ...

1. uses `type-hint-body` context for variable type annotations
2. replaces the scope by the more general `meta.type.python`, which is used by
   type comments already. It is originally inspired by PHP's type hints, which
   also use `meta.type`.

Python doesn't scope variables `meta.variable` in general. Hence former chosen
`meta.variable.annotation` is a kind of compromise to follow the scheme of
function definition scopes.

A common `meta.type` scope should help color schemes to maintain consistent
highlighting for all type hints.
This commit...

1. replaces `annotation` by `type` in scope names
2. scopes function return types `meta.function.return-type` which is used in
   most other syntaxes. The annotation character is specified via `meta.type`.

The overall goal is to apply scope naming scheme for type hints, which is
implemented already in PHP, TypeScript and TSX, using `meta.type`.

Renamed contexts should clarify this syntax explicitly targets type hints in
function/parameter/variable annotations.
This commit restricts `meta.type` to cover type expressions without leading or
trailing whitespace. It results in better UX in case a background color is
assigned by a color scheme.
This commit interprets brackets in types as lists of type members.
Copy link
Collaborator

@FichteFoll FichteFoll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good in general, except that they also prevent other uses of annotations which are perfectly valid Python but now show as invalid. I've provided a few examples below.

I'll see if I can try this out on some of my code over the next week.

Python/tests/syntax_test_python.py Outdated Show resolved Hide resolved
Python/tests/syntax_test_python.py Outdated Show resolved Hide resolved
Python/tests/syntax_test_python.py Outdated Show resolved Hide resolved
@deathaxe
Copy link
Collaborator Author

AFAIK, annotations are basically just something like comments, thus nearly any content is valid, syntactically.

Yes, this PR restricts annotations to type hints, as this is the most common use for them, these days.

@FichteFoll
Copy link
Collaborator

FichteFoll commented May 12, 2023

Yes, this PR restricts annotations to type hints, as this is the most common use for them, these days.

I agree that type hints are the most frequently used use case, but I don't agree with highlighting other uses as invalid. Invalid highlighting is very disruptive and should only be used when the lexer is confident that it is indeed invalid (syntax), which is not the case here.
In order to provide a workable alternative here, I believe we'd either need to add branching support to only apply meta.type when type hint parsing didn't encounter an error, e.g. an unexpected keyword, or simply be less aggressive about invalid highlighting by also allowing all other expressions (while still applying the meta.type scope to them).

@deathaxe
Copy link
Collaborator Author

deathaxe commented May 12, 2023

TBH, I don't want to add branching for that, as it overcomplicates things.

The alternative is to just revert annotations to use expressions-in-a-group etc. with the cost of inconsistent scoping of types between type comments and annotations with regards to square brackets for instance. Those are scoped meta.item-access in normal expressions, but meta.sequence in types as the former doesn't make sense in a type definition.

I'd still scope them meta.type as the primary use is to give color schemes the ability to dimm type-hints or assign backgrounds with a single scope, rather than needing to address all the different scopes such as meta.function.parameters.annotation, meta.function.annotation.return, meta.variable.annotation, ... .

Ideally we would use meta.annotation, but that's already used for decorators and addressed in color schemes accordingly. This may cause really strange highlighting in some of them.

This commit partly reverts restrictions to allow only type-hints in annotations.

For sake of consistent highlighting brackets are scoped `meta.brackets` instead
of `meta.item-access` and `meta.sequence.list.members`, so chances are lower for
equally looking terms to be higlighted differently.

Annotations are still scoped `meta.type` as this is the primary use case and the
scope is to be used by color schemes for dimming type hints easily.
@frou
Copy link
Contributor

frou commented May 12, 2023

In the following, it would be good if modulename and TypeName are given distinguishable scopes, so that a Color Scheme can choose to highlight only TypeName as a type instead of the whole thing:

def foo() -> modulename.TypeName:
    pass

Similar to a previous change which was welcome: PR #3421

@deathaxe
Copy link
Collaborator Author

Ideally, yes. Not so easy as annotations are expected to support ordinary expressions, where the qualifier part could be anything (variable, module, class, ...).

@rchl
Copy link
Contributor

rchl commented May 12, 2023

I'd still scope them meta.type as the primary use is to give color schemes the ability to dimm type-hints or assign backgrounds with a single scope

Sorry if this is not a question entirely relevant to this PR but how would one do that easily?
With this PR, if I add:

        {
            "scope": "meta.type",
            "foreground": "color(rgb(255, 255, 255) alpha(0.4))",
        },

to my color scheme then it only grays out some of the types and not the other. Probably because of scope priorities.

Screenshot 2023-05-12 at 22 07 49

@rchl
Copy link
Contributor

rchl commented May 12, 2023

Interesting that the types are now highlighted even in imports:

Screenshot 2023-05-12 at 22 10 55

Don't have any strong opinions about it. I guess it makes sense to be a bit more semantic, as long as there won't be false positives.

Here is VSCode with its semantic highlighting.

Screenshot 2023-05-12 at 22 12 36

Not sure if I hate or just don't like it ;)

@michaelblyons
Copy link
Collaborator

michaelblyons commented May 12, 2023

I'd still scope them meta.type as the primary use is to give color schemes the ability to dim type-hints or assign backgrounds with a single scope

Sorry if this is not a question entirely relevant to this PR but how would one do that easily?
With this PR, if I add:

Your later scopes are coloring things overtop the gray applied by meta.type. From before the PR, I had this tweak, which seemed to work even for Nones or strs whose coloration scopes come after the selector I was use below.

        {
            "name": "Python type annotation",
            "scope": "source.python meta.function.parameters.annotation, source.python meta.function.annotation.return",
            "background": "var(blue3)",
            "foreground_adjust": "saturation(- 40%) lightness(- 20%)",
        },

Demonstration of 40% desaturated Python type annotations

Obviously, you could desaturate all the way down or whatever, but I find that keeping a little color (but very de-emphasized) is nice.

@deathaxe
Copy link
Collaborator Author

deathaxe commented May 13, 2023

How specific a scope selector of a color scheme customization needs to be, depends on the underlying color scheme. I mainly use a (heavily) customized Brackets color scheme with following rules to make type hints stand out. Basically the same idea as Michaels. I just found it annoying to need to target all the different meta.function.parameters.annotation, meta.function.annotation.return ... scopes individually. Hence meta.type should target them all.

		// Python type annotations

		{
			"name": "Annotation Background",
			"scope": "meta.type",
			"background": "color(var(textcolor) blend(var(background) 4%))",
		},

		{
			"name": "Python Type Annotations: Types",
			"scope": "meta.type & (meta.generic-name, support.class, support.type, storage)",
			"foreground": "color(var(grey) blend(var(blue) 70%))",
		},

		{
			"name": "Python Type Annotations: Constants",
			"scope": "meta.type & (constant.language, constant.other, string)",
			"foreground": "color(var(grey) blend(var(orange) 70%))",
		},

Interesting that the types are now highlighted even in imports

This PR adds all exported classes/types from python 3.11's typing module to the globally applied list of predefined types, such as str, dict etc. as they are provided by the stdlib.

@rchl
Copy link
Contributor

rchl commented May 13, 2023

A scope like meta.type & (meta.generic-name, support.class, support.type, storage, constant.language, punctuation.section) seems to cover all cases I've seen so far (but probably still misses some cases). It's indeed bad that the rule can't be made color scheme agnostic. I think we should have something like !important. Even if it will be hated later, it sounds like an only way to fix that case.

EDIT: Also in TS it doesn't even seem possible to override this region (the Record part) added by LSP semantic highlighting:

Screenshot 2023-05-13 at 11 53 29

It's a support.class.js scope and adding it to the rule doesn't help.

@michaelblyons
Copy link
Collaborator

michaelblyons commented May 13, 2023

@rchl Did you try foreground_adjust (also requires setting background) like my example?

Edit: Hmm. I don't know if it can override LSP extras.

@rchl
Copy link
Contributor

rchl commented May 13, 2023

@rchl Did you try foreground_adjust (also requires setting background) like my example?

Edit: Hmm. I don't know if it can override LSP extras.

Doesn't help with overriding semantic regions.

@jwortmann
Copy link
Contributor

With the builtins from the typing module being scoped support.class.typing, would it be better to adjust the scope for the built-in exception classes to support.class.exception now? They don't need to be imported, but I would say due to the PascalCase names they feel closer to the library classes, than to the intrinsic types like int, str, etc. (which are support.type.python at the moment...).

scope: support.type.exception.python

@deathaxe
Copy link
Collaborator Author

Makes sense.

deathaxe added 2 commits May 14, 2023 10:58
With classes from typing module being scoped `support.class` it makes sense to
scope exception classes this way, too.

So all CamelCase types are organized under the same 2nd-level scope.
Scope naming guidelines request basic builtin types such as `int` to be scoped
`storage.type` as `support` scope is dedicated for library or user defined
types.

This commit changes scope of builtins accordingly as a result of former changes
of type scopes.
@keith-hall
Copy link
Collaborator

Question: Does it make sense to support highlighting type comments when they are already deprecated? microsoft/pyright#3557

@FichteFoll FichteFoll dismissed their stale review May 23, 2023 11:46

Resolved

FichteFoll
FichteFoll previously approved these changes Jun 4, 2023
Copy link
Collaborator

@FichteFoll FichteFoll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Used the wrong button earlier.

@deathaxe
Copy link
Collaborator Author

deathaxe commented Jun 4, 2023

I fundamentally disagree with scoping identifiers the same for the type definition/declaration and the namespace/constructor use cases.

As {{builtin_types}} has been used once, all builtin types have been scoped/highlighted equaly in all locations / use cases before. Hence I am not sure I can follow.

TL;TR #1842 : Can you give any example?

@FichteFoll
Copy link
Collaborator

TL;TR #1842 : Can you give any example?

Yes.

def stringify(x: int) -> str:
    return str(x)

def shadow_update_dict(d1: dict, d2: dict) -> Mapping:
    return ChainMap(d2, d1)

In this example, str is being used as a type (in the type annotation) but also as a constructor. Or ChainMap is used as a constructor and Mapping as a type. Syntactically, types have a different meaning if they are used inside the type context than inside an expression and this also applies to statically typed languages like C.


That said, which scope guideline are you referring to in fc93275 (#3736) because the ST ones probably couldn't be it as they are … kind of … weird and could definitely use an update.

storage.type

Keywords that affect the storage of a variable, function or data structure should use the following scope. Examples include static, inline, const, public and private.

storage.modifier

Keywords for functions or methods should use the following scopes. Example keywords include func, function and def. This includes storage.type for backwards compatibility with older color schemes.

https://www.sublimetext.com/docs/scope_naming.html#storage

So, I assume you're referring to the TM ones, which I guess are still the most recent guidelines that have been written down and at least kind of make sense. I should really get SNG going at some point …

@jwortmann
Copy link
Contributor

as they are … kind of … weird and could definitely use an update

Your excerpt of the guidelines is a bit unconventional as well, as it might suggest that the listed scopes were the headings, followed by a description. It makes more sense if you adjust the quoted region like it is actually meant to be read 😆 :

Types should use the following scope. Examples include int, bool and char.

storage.type

Keywords that affect the storage of a variable, function or data structure should use the following scope. Examples include static, > inline, const, public and private.

storage.modifier

So I would say that storage.type for the intrinsic classes with lowercase first letter like int, str, etc. is not wrong. But a possible improvement could be to use storage.type when they appear "standalone" like in isinstance(int, ...), or in a type hint, but use another scope like support.function when they are used as a constructor, e.g. str(x) or dict(), which is indeed a function. This would also be in line with your ChainMap(d2, d1) example, which already uses the variable.function scope. I don't know whether or how much it would increase the complexity of the syntax definition, but I'd imagine it should be quite simple to implement; just check whether the type is followed by (.

@FichteFoll
Copy link
Collaborator

Your excerpt of the guidelines is a bit unconventional as well, as it might suggest that the listed scopes were the headings, followed by a description.

Which is exactly how I read it – multiple times at that – but I see now that this is not the way they were inteded to be read. Still, I still thing the section is weird.

So I would say that storage.type for the intrinsic classes with lowercase first letter like int, str, etc. is not wrong.

Yes, I didn't say it was wrong, it's just not something I personally agree with.

I don't know whether or how much it would increase the complexity of the syntax definition, but I'd imagine it should be quite simple to implement; just check whether the type is followed by (.

The same thing is already applied for function calls, as you mentiond, so it really shouldn't be that much of an issue. At this point, it's mostly the should part that needs to be discussed, not the could.

@deathaxe
Copy link
Collaborator Author

deathaxe commented Jun 5, 2023

The intent behind storage.type for int, ... is to

  1. follow the guildelines
  2. increase number of syntaxes applying this scheme
  3. provide a distinction between builtin types and those shipped by stdlib (e.g.: typing and exceptions).
    This would would probably work with support.type vs. support.class as well.

Types being scoped the same in declarations and constructors is not introduced by this PR, but rather a long standing implementation detail as well as scoping tokens as builtins/keywords in locations they are not treated as such by the interpreter.

What I'd be concerned about in general is trying to add too much semantic details to syntaxes or scoping keywords differntly depending on where or in which context they appear. It is easy for some syntaxes, but there are as many syntaxes it would require sophisticated context structures and rules if even doable at all, reliably - not even thinking about ambiguities, which can't be resolved by a static lexer.

Actually instantiation expressions (e.g.: a = str(x)) are handled by ordinary function-call contexts in Python. Thus scoping all constructors as ordinary variable.function or anything else would be an easy change.

I am however uncertain whether scoping the same kind of instantiation expression in C++/Java (e.g.: a = new string(x)) as ordinary function-call would be appropriate.

Another aspect to consider are type casts such as a = string(c). Sure, they can (as every expression) be interpreted as function calls. But actually I'd rather expect string to look like a data type in that situation.

@FichteFoll
Copy link
Collaborator

Let's not stride too much into #1842 territory here. The only point I was trying to make with my comment is the following:

I wonder if it wouldn't be better to rather stay on the defensive side here and only change the scopes of long-untouched parts of syntaxes once we have a decision to not disrupt user workflows potentially twice.

I don't have a strong opinion on it but I wanted to raise it regardless to hear your thoughts on the matter. (However, if you're strongly inclined to highlight types the same always, that would imply this to be the only necessary change, so I suppose you'd rather be in support of changing it now.)

@deathaxe
Copy link
Collaborator Author

deathaxe commented Jun 7, 2023

I don't want to introduce/enforce changes other members don't feel comfortable with. Just an ordinary contributor here as anyone else.

It's probably not a big deal to stay with the old scopes - may proable even avoid unexpected visual changes.

However the only other syntaxes which scope basic builtin types support.type are Go, Js/TS and maybe Rust. Go even stacks storage.type and support.type in some locations but not all. I am not familiar enough with JS/TS and Rust for real judgement at the moment.

All other syntax definitions seem to follow the builtin = storage.type vs. lib/user = support.type scheme.

Code seems to look a bit inferior with Mariana color scheme as color for keyword and storage.type is the same. Situation would improve if constructors were scoped as normal functions. Don't see much difference in Monokai and haven't so in the one I am primarily using.

Are there other oppinions?

michaelblyons
michaelblyons previously approved these changes Jun 12, 2023
Copy link
Collaborator

@michaelblyons michaelblyons left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have been fine either way.

FichteFoll
FichteFoll previously approved these changes Jun 12, 2023
@deathaxe deathaxe added the on hold Something that can't be done, or done correctly, right now label Jun 19, 2023
@deathaxe
Copy link
Collaborator Author

Reverting annotations to normal expressions breaks adjusted LSP-pyright syntax and has some unwanted side effects with regards to scoping | in annotations. It is scoped different, depending on being in top-level or within a group or list.

Some more adjustments seem to be required to get things inline.

This commit scopes `|` as normal arithmetic operator in type comments for
consistency reasons, because type hints in annotations (variables, parameters,
return-type) need to use normal expression syntax.
@deathaxe deathaxe dismissed stale reviews from FichteFoll and michaelblyons via 3e550f4 June 19, 2023 16:45
michaelblyons
michaelblyons previously approved these changes Jun 19, 2023
Copy link
Collaborator

@michaelblyons michaelblyons left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kind of icky, but okay.

FichteFoll
FichteFoll previously approved these changes Jun 19, 2023
Copy link
Collaborator

@FichteFoll FichteFoll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can definitely iterate on type the union operator later if we discover a good solution.

This commit introduces some ...-annotation-content contexts to allow 3rd-party
syntaxes to easily extend annotation expressions or replace them by type-hints,
with loose coupling and without degrading context bailouts.
@deathaxe deathaxe dismissed stale reviews from FichteFoll and michaelblyons via 87866b1 June 23, 2023 16:58
@deathaxe deathaxe removed the on hold Something that can't be done, or done correctly, right now label Jun 25, 2023
@deathaxe deathaxe merged commit a206c7d into sublimehq:master Jun 25, 2023
@deathaxe deathaxe deleted the pr/python/type-hints branch June 25, 2023 16:15
frou added a commit to frou/Humid that referenced this pull request Dec 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants