Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ExtendedData to FSharpDiagnostic #15840

Merged
merged 15 commits into from
Sep 13, 2023
Merged

Conversation

DedSec256
Copy link
Contributor

@DedSec256 DedSec256 commented Aug 22, 2023

This PR implements fsharp/fslang-suggestions#1094 and proposes the following:

  • Add property ExendedData: IFSharpDiagnosticExtendedData option to FSharpDiagnostic,
    where IFSharpDiagnosticExtendedData implementations are created from PhasedDiagnostic exceptions and can lazily provide the data needed for quick fixes and other IDE-related things
  • Add several IFSharpDiagnosticExtendedData implementations as a proof-of-concept (and which are already needed to stop parsing error messages for some quick fixes :) )

To do:

  • Add missing docs for the new api
  • Fix CI

@DedSec256 DedSec256 requested a review from a team as a code owner August 22, 2023 17:14
@vzarytovskii
Copy link
Member

Probably needs some docs + examples of usage.

@vzarytovskii
Copy link
Member

vzarytovskii commented Aug 22, 2023

Binary compatibility bothers me a bit here. Will need to think about implications, if we'll need to be making
any changes.

@baronfel @TheAngryByrd thoughts?

@TheAngryByrd
Copy link
Contributor

Binary compatibility bothers me a bit here. Will need to think about implications, if we'll need to be making any changes.

@baronfel @TheAngryByrd thoughts?

From a FSAC point of view, I think we've grown accustomed to dealing with breaking changes, so not too worried there.

The bigger concern would be from some analyzer to be point of view, where if analyzers were relying on some form of structured data, I think FCS would need to publish it for ever, or at least until some understandable breaking change point (and not in a hotfix release). Would have to probably create new types if we need new ways to portray the information. Feels akin to an event sourcing versioning problem.

@kerams
Copy link
Contributor

kerams commented Aug 23, 2023

True enough, but I don't see a way around it unless you want to do IDictionary<string, obj> containing the barest primitives.

@baronfel
Copy link
Member

@TheAngryByrd yes, event sourcing was exactly the problem I had in mind. That's why my initial suggestion was something fairly untyped as the message envelope (IDictionary<string, obj> or IDictionary<string, string>) with 'schemas' of particular string-based keys being probe-able from the consumer's side to prevent typecasting/API-change-based errors.

That's just a looser form of API compatibility problem, though.

@T-Gro
Copy link
Member

T-Gro commented Aug 23, 2023

Binary compatibility bothers me a bit here. Will need to think about implications, if we'll need to be making any changes.
@baronfel @TheAngryByrd thoughts?

From a FSAC point of view, I think we've grown accustomed to dealing with breaking changes, so not too worried there.

The bigger concern would be from some analyzer to be point of view, where if analyzers were relying on some form of structured data, I think FCS would need to publish it for ever, or at least until some understandable breaking change point (and not in a hotfix release). Would have to probably create new types if we need new ways to portray the information. Feels akin to an event sourcing versioning problem.

That is the reason why I suggested using a composable list of lower-level extended data items , instead of an option.
That way, an analyzer could have a logic for searching for a particular item, e.g. "ActualTypeExtendedData".

And it would not fail if the list is extended in the future, neither compile-time nor run-time.

It is similar to an approach with a dictionairy of primitives, but instead of .NET primitives, it would use compiler-domain-primitives, and instead of string-based keys, the naming would come from the name of a type.

From the point of view of this PR, it's not much more to change - turn option into a list, and remodel some of the examples.

@DedSec256
Copy link
Contributor Author

DedSec256 commented Aug 24, 2023

@vzarytovskii,

Probably needs some docs + examples of usage.

The documentation is not yet written, while the api discussions are going on. But of course it will be added.

Small prototype JetBrains/resharper-fsharp@fd9dd3c.
One of the problems that can be solved is related to parsing diagnostic messages for additional information necessary to create strongly typed warnings/errors and quick fixes for them — these hacks stop working when changing localization.

Also, we have a bunch of PRs with new quick fixes:

https://github.com/JetBrains/resharper-fsharp/blob/f04f7326b3888b778238d0a1e0e6283a39f6d532/ReSharper.FSharp/src/FSharp.Psi.Daemon/src/Stages/FcsErrorsStageProcessBase.fs#L319-L325 (from JetBrains/resharper-fsharp#545)

https://github.com/JetBrains/resharper-fsharp/blob/884e2a68eda184534cd349cc893175978df8bcb8/ReSharper.FSharp/src/FSharp.Psi.Daemon/src/Stages/FcsErrorsStageProcessBase.fs#L319-L325 (from JetBrains/resharper-fsharp#549)

https://github.com/JetBrains/resharper-fsharp/blob/432d2009fdda494222184ab2865f77195c7c5678/ReSharper.FSharp/src/FSharp.Psi.Daemon/src/Stages/FcsErrorsStageProcessBase.fs#L319-L329 (from JetBrains/resharper-fsharp#546)

Where instead of error.Message.StartsWith/EndsWith can be used type check for ValueNotContainedDiagnosticExtendedData and data to compare from it:

type ValueNotContainedDiagnosticExtendedData
internal (symbolEnv: SymbolEnv, signatureValue: Val, implValue: Val) =
interface IFSharpDiagnosticExtendedData
member x.SignatureValue = FSharpMemberOrFunctionOrValue(symbolEnv, mkLocalValRef signatureValue)
member x.ImplementationValue = FSharpMemberOrFunctionOrValue(symbolEnv, mkLocalValRef implValue)

As you can see from these examples, a single extended data type (such as TypeMismatchDiagnosticExtendedData) can be used for several different diagnostics, even if they have a different number, but the same (or similar) semantics.

At the same time, different extended data types can be used to restore the diagnostic context in the case of heterogeneous diagnostics under the same number (as, for example, in MissingErrorNumber diagnostics group)

@baronfel,

That's why my initial suggestion was something fairly untyped as the message envelope (IDictionary<string, obj> or IDictionary<string, string>) with 'schemas' of particular string-based keys being probe-able from the consumer's side to prevent typecasting/API-change-based errors.

Currently, FCS provides a strongly typed API for all public subsystems. Quite unexpectedly and confusing to receive in this case a set of raw strings/objects with which additional manipulations must be carried out. At the same time, lazy-computed properties with the necessary data can be located inside a separate type, which can play a role in large sets of additional data. For example, extended data might provide a set of missing elements to implement an interface that might not be needed at all, because no one would call a quick fix for it, but it would still have to be computed and put into a dictionary. With properties, we also can change data calculation to lazy without losing backward compatibility.

@T-Gro,

That is the reason why I suggested using a composable list of lower-level extended data items , instead of an option.
That way, an analyzer could have a logic for searching for a particular item, e.g. "ActualTypeExtendedData".

Together with diagnostic numbers and strongly typed extended data, we can not only get additional data, but also the ability to restore the context of the diagnostic itself.

For example, in this code for heterogeneous diagnostics with a single number, from the ExtendedData type we can restore the context without a message
image

If we want to make a list of ExtendedData, we need to think about how the semantics of diagnostic can be restored from the list? Does the order of the data in this list matter? For example, it is easy to imagine how the developer of some analyzer simply accesses the first element in the list, always consider it appropriate. And it's easy to accidentally break the order of extended data types in a list.

In the current PR, we have the following options, which seem to be no worse than lists and dictionaries

  1. Use existing diagnostic extended data if it makes sense
  2. Add properties to existing diagnostic if necessary
  3. Create a new extended data type for new diagnostics if others do not fit

ExtendedData can also be composed with each other. Types can be composed in a variety of ways, unlike lists and dictionaries – there are no contracts in collections, and the type has an explicit contract.

@baronfel
Copy link
Member

Thanks for the detailed responses @DedSec256. Your point about lazy calculation of properties allowing for minimizing the cost of generating good diagnostic data is a great one - it's conceptually similar the reason why detecting a codefix is relevant and applying the codefix is logically separate in models like the LSP.

I think I'm beginning to be convinced. We already have some amount of API compat issues, but this kind of API compat is entirely in-process and manageable by the editors - it will only appear as each editor updates its FCS dependency. It's not going to start happening by e.g. a .NET SDK update alone. And the mitigations would hopefully be reasonable - only areas that dealt directly with each diagnostic would be impacted. And putting the relevant codefix data directly on the extended data would reduce the amount of duplicated code in each editor (to do things like determine the name of a type to suggest, or similar).

@T-Gro
Copy link
Member

T-Gro commented Aug 25, 2023

Together, diagnostic numbers and strongly typed extended data, we can not only get additional data, but also the ability to restore the context of the diagnostic itself.

Strongly typed data tailored for each diagnostic specifically is of course a superior data-transfer model from the consumer point of view, but it only holds in a closed-world assumption.
That is, the consumer is tightly coupled to the type definitions exposed and possibly updated with every release.

Which I don't believe can hold in an external analyzer scenario, which is on the roadmap.

That being said, could you please mark all the added types as [<Experimental>] ?

@nojaf
Copy link
Contributor

nojaf commented Aug 25, 2023

I want to throw in my plus one for the current proposal.

but this kind of API compat is entirely in-process and manageable by the editors

I'm pretty convinced that the amount of consumers of this exact API will be minimal. Getting a sense of what FCS consumers would actually use this and how they feel about nice typed data should definitely weigh in on the decision here.

As for the compatibility issue, this could be detected early on if public preview FCS releases are consumable for all parties and dropped frequently. I only want to state that this potential problem might not be as painful as you currently think it is.

@DedSec256
Copy link
Contributor Author

DedSec256 commented Aug 25, 2023

Strongly typed data tailored for each diagnostic specifically is of course a superior data-transfer model from the consumer point of view, but it only holds in a closed-world assumption.
That is, the consumer is tightly coupled to the type definitions exposed and possibly updated with every release.
Which I don't believe can hold in an external analyzer scenario, which is on the roadmap.

Existing F# diagnostics almost never change, this means that the ExtendedData Type corresponding to the diagnostic is unlikely to ever change, but, of course, its content may change. Here we can use standard API versioning approaches: leave the old properties for compatibility, mark them with [<Deprecated>], add new properties, which may contain even a separate subtype, and so on.

A loosely typed solution will just silently break external users.

@DedSec256
Copy link
Contributor Author

I think it's ready for review and further discussion

@nojaf
Copy link
Contributor

nojaf commented Sep 5, 2023

I'm genuinely enthusiastic about the potential trajectory here. I've been actively experimenting with incorporating additional data into #15256, and you can check out my work at https://github.com/DedSec256/fsharp/pull/2/files.

The notion of having both the error code and supplementary information seamlessly integrated into the IDE holds significant promise for uncovering optimal refactoring opportunities.

In the example I've provided, I can readily guide users towards considering the inclusion of a type annotation precisely where the ambiguity arises within the record definition.

image

@vzarytovskii
Copy link
Member

vzarytovskii commented Sep 5, 2023

We'll need to go over it next Monday on our review session and discuss implications for the public surface and both future Analyzers SDK and LSP, and how will we go about deprecating stuff if needed.

Copy link
Member

@vzarytovskii vzarytovskii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea, but a bit on the fence because of how ad-hoc this solution is (not necessary a bad thing), and how much of the new public surface we're exposing.

I don't mind merging it and see how does it look with real compiler diagnostics exposed to the user, but would like to hear others' opinions about it.

@psfinaki
Copy link
Member

psfinaki commented Sep 11, 2023

Hey, I will go thorough this tomorrow to get the idea of the possible implications on the editor side - stay tuned.

Copy link
Member

@psfinaki psfinaki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry didn't want to block this. Left some notes - might be misunderstanding something.

@psfinaki psfinaki dismissed their stale review September 12, 2023 15:05

by mistake

@DedSec256
Copy link
Contributor Author

@psfinaki,

Hey this might indeed break a code fix, the RenameParamToMatchSignature one, here. This should be addressed - which might be just changing the diag code in the fix/tests or something more elaborate.
Should be easy since we have tests for this now. But I can help with this if needed.
That said, this should be breaking now then...

As we can see, this test is green after the changes, since the quickfix itself is tied to the diagnostic number, which I did not change.

image

And even more, in the future this code can be replaced by simply getting the names of the arguments from ArgumentsInSigAndImplMismatchExtendedData

let getSuggestion (d: Diagnostic) =
let parts = Regex.Match(d.GetMessage(), ".+'(.+)'.+'(.+)'.+")
if parts.Success then
ValueSome parts.Groups.[1].Value
else
ValueNone

But yeah more importantly - why the removal is needed?

For this error, instead of the general diagnostic exception type with a number and message, I created the separate strict type ArgumentsInSigAndImplMismatch, in order to then create the ArgumentsInSigAndImplMismatchExtendedData for it, by analogy with other more typed errors.

| ArgumentsInSigAndImplMismatch(sigArg, implArg) ->
Some(ArgumentsInSigAndImplMismatchExtendedData(sigArg, implArg))

Not strongly against but not a fan of marker interfaces either. What could be alternative designs here?

Base class? :)
Using the marker interface, the API consumer can find a list of all its implementations and thereby understand which ExtendedData exist for diagnostics. You can also later specify some common properties for all ExtendedData by adding them to this interface/base class.

Copy link
Member

@psfinaki psfinaki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DedSec256 thanks for the explanations, it took me some time to understand what's going on here.

I don't have any general objections, things "feel" a bit heavy but I don't have a better design idea and the potential benefits for the tooling outweigh the big API surface for me.

We have this as an experimental so for me it would be important to try building something on top of it to see this in action. Maybe we can then adjust the API on-the-go a bit.

Thanks for this!

@T-Gro T-Gro enabled auto-merge (squash) September 13, 2023 07:33
@DedSec256
Copy link
Contributor Author

Can you, please, mark conversations that are no longer relevant as resolved? :)
So that I can better understand what issues about this PR are relevant and require my changes.

@T-Gro T-Gro merged commit 4dc1f3a into dotnet:main Sep 13, 2023
@DedSec256
Copy link
Contributor Author

Thanks to everyone who took part in this PR (:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

8 participants