Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CodeActions for Number Constants: Convert between bases, Add digit group separators #1167

Merged
merged 15 commits into from
Sep 24, 2023

Conversation

Booksbaum
Copy link
Contributor

Some Code Actions for number SynConsts:
(inspired by dotnet/fsharp#3043 & dotnet/fsharp#3044)

Convert between bases/formats:

  • Convert between int bases (decimal, hex, octal, binary):
    ConvertBetweenIntBases
    • supports all ints (sbyte, byte, int16, uint16, int, int32, uint, uint32, nativeint, unativeint, int64, uint64) and floats (float, float32) in hex/oct/bin form
  • Convert between char formats (char, decimal, 3 hexs):
    ConvertBetweenCharBases
  • Note: no Convert between decimal & scientific notation for floats: not precise/too precise/not lossless with float value and I wasn't in the mood to implement conversion based on strings....
  • Note: no Convert between float & int notation for floats: issues with precision/rounding
  • Note: I try to preserve existing signs (including unnecessary ones like a +). But sometimes it's necessary to add or change the sign
    • for example with negative bit: 0b1010_0101y -> -91y: decimal requires a minus sign
    • But also means: not reversible: 0b1010_0101y -> -91y -> -0b1011011y

Replace with Named Constant (MinValue, MaxValue, ±infinity, nan, Epsilon):
ReplaceWithInfinity

Extract/Integrate Minus:
IntegrateExtractMinus

  • for Hex/Oct/Bin formats
  • Additional Actions for other shenanigans with signs:
    UseImplicitPlus
    RemoveAdverseMinus

Add digit group separators:

  • for ints: group length based on format:
    AddSeparatorToDecimalInt
    • group lengths:
      • decimal: thousands (3)
      • hex: words (4), bytes (2)
      • octal: 3 (is there a special name?)
      • binary: nibbles (4), bytes (8)
  • for floats: always 3-digit-groups, and always for all parts (int, decimal, exponent):
    AddSeparatorToFloat
    • when float in int form: see ints
  • when at least one group separator exists: QuickFix to remove separators:
    RemoveSeparator

Pad binary with 0s:
PadWithZeros2

  • Available target lengths: 4, 8, 16 bits
  • Or should I change this to pad to next full 4 or 8 bit?
    (So if I have 0b101 I would Pad to next 8 bit to get 0b00000101 and then additional Pad to next 8 bit to get 0b0000000000000101)



Some Additional Notes:

  • To prevent errors with newly added sign I add a space when operator char immediately before:
    //             -91y
    let value = 5y+0b1010_0101y
    
    // => Convert to decimal
    
    // without space:
    let value = 5y+-91y
    //            ^^
    //            The type 'sbyte' does not support the operator '+-'
    
    // with space:
    let value = 5y+ -91y
    But I'm not sure I handle all cases a new sign might lead to invalid code.
    I currently check for leading !$%&*+-./<=>?@^|~ (F# operator chars).
    Are there any other cases a number is valid immediately after something else but with additional sign it turns into an error?

Add CodeFix: Convert to other base/form for int, char, float (in hex/oct/bin form)

Add CodeFix: Extract/Integrate Minus (in hex/oct/bin from)

Add CodeFix: Replace with +-infinity & nan

Add CodeFix: Add Digit Group Separator for int, float

Add CodeFix: Pad binary with leading `0`s

Note:
QuickFixes might lead to invalid code:
  ```fsharp
  //             -91y
  let value = 5y+0b1010_0101y

  // => Convert to decimal

  let value = 5y+-91y
  //            ^^
  //            The type 'sbyte' does not support the operator '+-'
  ```
  * I think that's acceptable because:
    * rare
    * error is close to cursor
    * easy to fix by user

Note:
QuickFixes not available for enum values, except when direct constant:
  ```fsharp
  type MyEnum =
    | Alpha = 42      // CodeAction
    | Beta = (42)     // no CodeAction
    // requires preview
    | Gamma = (1<<7)  // no CodeAction
  ```
  * Reason: there's currently no easy way to walk into `SynExpr` for enums -> only direct `SynConst`s are handled
If there's an existing operator char immediately before the new sign, a space will be added.

```fsharp
//             -91y
let value = 5y+0b1010_0101y

// => Convert to decimal

// prev:
let value = 5y+-91y
//            ^^
//            The type 'sbyte' does not support the operator '+-'

// now:
let value = 5y+ -91y
```
Note: `Base` & `CharFormat` are public because they are used in Tests
Handled Elements: Parens & App (and of course Const)
Additional: Update Fantomas
Note: not via Fantomas: changes to much to something less readable
Note: Double Quotation marks remain unescaped
Comment on lines 688 to 768
/// Returns:
/// * `None`: unhandled `SynConst`
/// * `Some`:
/// * Simple Name of Constant Type: `SynConst.Double _` -> `Double`
/// * `FSharpType` matching `constant` type
/// * Note: `None` if cannot find corresponding Entity/Type. Most likely an error inside this function!
let tryGetFSharpType
(parseAndCheck: ParseAndCheckResults)
(constant: SynConst)
= option {
//Enhancement: cache? Must be by project. How to detect changes?

let! name =
match constant with
| SynConst.Bool _ -> Some <| nameof(System.Boolean)
| SynConst.Char _ -> Some <| nameof(System.Char)
| SynConst.Byte _ -> Some <| nameof(System.Byte)
| SynConst.SByte _ -> Some <| nameof(System.SByte)
| SynConst.Int16 _ -> Some <| nameof(System.Int16)
| SynConst.UInt16 _ -> Some <| nameof(System.UInt16)
| SynConst.Int32 _ -> Some <| nameof(System.Int32)
| SynConst.UInt32 _ -> Some <| nameof(System.UInt32)
| SynConst.Int64 _ -> Some <| nameof(System.Int64)
| SynConst.UInt64 _ -> Some <| nameof(System.UInt64)
| SynConst.IntPtr _ -> Some <| nameof(System.IntPtr)
| SynConst.UIntPtr _ -> Some <| nameof(System.UIntPtr)
| SynConst.Single _ -> Some <| nameof(System.Single)
| SynConst.Double _ -> Some <| nameof(System.Double)
| SynConst.Decimal _ -> Some <| nameof(System.Decimal)
| _ -> None

let isSystemAssembly (assembly: FSharpAssembly) =
match assembly.SimpleName with
// dotnet core
| "System.Runtime"
// .net framework
| "mscorlib"
// .net standard
| "netstandard"
-> true
| _ -> false

let assemblies = parseAndCheck.GetCheckResults.ProjectContext.GetReferencedAssemblies()
let ty =
assemblies
|> Seq.filter (isSystemAssembly)
|> Seq.tryPick (fun system -> system.Contents.FindEntityByPath ["System"; name])
|> Option.map (fun ent -> ent.AsType())

// Note: `ty` should never be `None`: we're only looking up standard dotnet types -- which should always be available.
// But `isSystemAssembly` might not handle all possible assemblies with default types -> keep it safe and return `option`

return (name, ty)
}
/// Fix that replaces `constantRange` with `propertyName` on type of `constant`.
///
/// Example:
/// `constant = SynConst.Double _` and `fieldName = "MinValue"`
/// -> replaces `constantRange` with `Double.MinValue`
///
/// Tries to detect if leading `System.` is necessary (`System` is not `open`).
/// If cannot detect: Puts `System.` in front
let replaceWithNamedConstantFix
doc
(pos: FcsPos) (lineStr: String)
(parseAndCheck: ParseAndCheckResults)
(constant: SynConst)
(constantRange: Range)
(fieldName: string)
(mkTitle: string -> string)
= option {
let! (tyName, ty) = tryGetFSharpType parseAndCheck constant
let propCall =
ty
|> Option.bind (fun ty ->
parseAndCheck.GetCheckResults.GetDisplayContextForPos pos
|> Option.map (fun displayContext -> $"{ty.Format displayContext}.{fieldName}")
)
|> Option.defaultWith (fun _ -> $"System.{tyName}.{fieldName}")
let title = mkTitle $"{tyName}.{fieldName}"
let edits = [| { Range = constantRange; NewText = propCall } |]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an easier way to detect if System is open? Or Format some code string at a certain location?

All I want is: Format 'System.Int.MaxValue' but only put System in front if necessary.

What I currently do: Find "System" Assembly by looking through all Assemblies and then get System.Int to use for formatting (with DisplayContext).
... That's rather ... involved ...

Comment on lines 76 to 91
type private Int =
static member inline abs(n: sbyte): byte =
if n >= 0y then byte n else byte (0y - n)
static member inline abs(n: int16): uint16 =
if n >= 0s then uint16 n else uint16 (0s - n)
static member inline abs(n: int32): uint32 =
if n >= 0l then uint32 n else uint32 (0l - n)
static member inline abs(n: int64): uint64 =
if n >= 0L then
uint64 n
else
// unchecked subtraction -> wrapping/modular negation
// -> Negates all numbers -- with the exception of `Int64.MinValue`:
// `0L - Int64.MinValue = Int64.MinValue`
// BUT: converting `Int64.MinValue` to `UInt64` produces correct absolute of `Int64.MinValue`
uint64 (0L - n)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to get a generic abs': 'int -> 'uint? (abs' just calls corresponding Int.abs)
Extension Methods cannot (yet) be used with SRTPs. So I need to redirect to my Int type. But then I couldn't manage to correctly resolve abs' without additional hints

Best I managed was:

let inline abs'<'Int, 'int, 'uint when 'Int :> Int and 'Int:(static member abs: 'int -> 'uint)> (value: 'int) : 'uint =
  'Int.abs value

But abs' -5 does not work. I have to either give a return type hint or specify generic types (as wildcard...)

// what I want:
printfn "|%i| = %i" -5 (abs' -5)
//                      ^^^^
//                      error FS0071: Type constraint mismatch when applying the default type 'Int' for a type inference variable. 
//                            This expression was expected to have type 'int' but here has type 'uint32'

// what works:
printfn "|%i| = %i" -5 (abs' -5 : uint)
printfn "|%i| = %i" -5 (abs'<_,_,_> -5)

Not something I use here in this PR (or usually somewhere else). But I used this PR to experiment a bit (too much...) with Span, inline and Generic Numbers. And abs' is one of the cases that came up, but couldn't get the result I wanted :(

(Another issue was with MaxValue & MinValue: Seems like SRTP doesn't support Constants/Fields?)

Copy link
Member

@TheAngryByrd TheAngryByrd Sep 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotta look to F#+ for all the real hard work. I hate that I know how to do this.

type MathU =

  static member inline AbsU(n: sbyte,_method : MathU) : byte = if n >= 0y then byte n else byte (0y - n)
  static member inline AbsU(n: int16,_method : MathU) : uint16 = if n >= 0s then uint16 n else uint16 (0s - n)
  static member inline AbsU(n: int32,_method : MathU) : uint32 = if n >= 0l then uint32 n else uint32 (0l - n)
  static member inline AbsU(n: int64,_method : MathU) : uint64 =
    if n >= 0L then
      uint64 n
    else
      // unchecked subtraction -> wrapping/modular negation
      // -> Negates all numbers -- with the exception of `Int64.MinValue`:
      //        `0L - Int64.MinValue = Int64.MinValue`
      //        BUT: converting `Int64.MinValue` to `UInt64` produces correct absolute of `Int64.MinValue`
      uint64 (0L - n)

  static member inline AbsU(n: nativeint,_method : MathU) : unativeint = if n >= 0n then unativeint n else unativeint (0n - n)

  static member inline Invoke (x)  : 'Output =
      let inline call (mthd: ^M, source: 'Input) = ((^M or ^Input) : (static member AbsU : 'Input * 'M -> 'Output) source, mthd)
      call (Unchecked.defaultof<MathU>, x)


let inline absU x = MathU.Invoke x

let doIt () =

  let byteEx = (absU -5y)
  let uint16Ex = (absU -5s)
  let uint32Ex = (absU -5l)
  let uint64Ex = (absU -5L)
  let unativeIntEx = (absU -5n)

  printfn "%A" byteEx
  printfn "%A" uint16Ex
  printfn "%A" uint32Ex
  printfn "%A" uint64Ex
  printfn "%A" unativeIntEx
  ()

doIt()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok ... I think I stick with passing the concrete abs/AbsU around ....

But always interesting to see what's possible with F#. Thanks 👍

Required for CI...
@Booksbaum
Copy link
Contributor Author

Ok...CI errors are because fantomas now respects .editorconfig (was incorrect pattern match for fs-files)
-> quite a lot of files need reformatting...

I just apply new formatting too all. If that's not ok we should adjust settings in .editorconfig

Was necessary because fantomas now uses settings in `.editorconfig`
(was incorrect pattern match for fsharp files)

`Formatted │ 32 │ Ignored │ 0 │ Unchanged │ 119 │ Errored │ 0`
In the long run: try make unit tests run faster (test CodeFixes directly (and individually) instead via full LSP?)
Copy link
Member

@TheAngryByrd TheAngryByrd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this!

@baronfel any input on this? Otherwise I think this is good to go.

Copy link
Contributor

@baronfel baronfel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing work, and as always incredible test coverage. Thank you for this, @Booksbaum!

FSAC_TEST_DEFAULT_TIMEOUT : 120000 #ms, individual test timeouts
timeout-minutes: 30 # we have a locking issue, so cap the runs at ~20m to account for varying build times, etc
timeout-minutes: 40 # we have a locking issue, so cap the runs at ~20m to account for varying build times, etc
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it concerns me a tiny bit that we keep climbing on this :(

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I basically doubled the test suite with NamedText/RoslynSourceText. Might be worth switching to RoslynSourceText by default then eventually removing NamedText

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, so this isn't because individual tests are hanging, it's because the suite overall is going long? that's more understandable. I'd agree with both of the things you mention.

I'm guessing there's some Expecto cancellation weirdness that's preventing us from having a truly per-test timeout?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing there's some Expecto cancellation weirdness that's preventing us from having a truly per-test timeout?

I think this does work (I see some tests fail at 2 mins) but we were hitting the overall timeout we set on the CI. I know I had to make a lot of tests sequential to keep the test suite passing consistently and that's a big part of it being slow too. Probably need to revisit the setup and see why things tend to not be thread safe. Might simply be we're not making an LSP server per test or something else.

I do vaguely remember the "wait for parse/check notifications" is how a lot of tests know how to continue after requesting a typecheck which can fail and not provide the correct feedback or gets lost in the logging.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing there's some Expecto cancellation weirdness that's preventing us from having a truly per-test timeout?

Yes, individual timeouts do work. It's "just" the overall timeout we run into.

I basically doubled the test suite with NamedText/RoslynSourceText. Might be worth switching to RoslynSourceText by default then eventually removing NamedText

I think that's reasonable. They should act the same, so just using one should be enough (and then maybe some tests to ensure they really act the same). Currently we're practically doing all tests twice -- despite not actually testing the XXXText, but the server functionality.
Locally the first thing I do before running tests is always uncomment NamedText -- which basically cuts the test time in halve.
Other "multipliers" are already uncommented ([1] [2]).


Probably need to revisit the setup and see why things tend to not be thread safe.

I think it's just the initialization and destruction of LSP server that must be sequential. The tests themselves should be able to run in parallel -- but are currently in sequenced. (Though moving tests into a nested testList doesn't seem to run them in parallel/affect performance? -> linked code doesn't seem to be limiting factor :/ )

Though there might also some issues with getting the correct notifications when tests are run in parallel (I think I remember an issue with that -- though might also be something outdated. I think notifications are now filter by file. So notifications should be assigned to correct tests.)


Might simply be we're not making an LSP server per test or something else.

We're already sharing a server per "logical group" (for example: per CodeFix -- but not for each individual test inside each CodeFix).
Though pushing that further outward might be good for speed. I remember just reusing a document (docTestList) has some noticeable (but not huge) impact. And a single doc should be way more light than a full LSP server.

We could use just one LSP server for all tests. But we're creating a lot of documents -- which I think get all cached in server? Also notifications from LSP Server to test environment are all cached. So might not actually be faster, but more memory intensive -- and definitely more difficult to debug.
Also: some tests require extra settings -> must get their own LSP server.

I think keeping the existing "One LSP server per logical group" makes most sense: State limited to logical context. And LSP settings explicit for that context.
(Though some behind the scene server caching and clearing a server state instead of throwing away might be reasonable too?)


I do vaguely remember the "wait for parse/check notifications" is how a lot of tests know how to continue after requesting a typecheck which can fail and not provide the correct feedback or gets lost in the logging.

Yes. Inclusive a required short delay even when correct notification already arrived ... just to ensure it's really the latest notification. Over 100s and 1000s of tests this adds up...

I think getting rid of this notification stream would be the best for tests: Difficult to filter & debug, might be out of order, lots of waiting to ensure/hope correct messages, lots of caching of old messages.
Replacing this with direct calls and direct returns would be great -- but then also not the LSP way :/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's just the initialization and destruction of LSP server that must be sequential. The tests themselves should be able to run in parallel -- but are currently in sequenced. (Though moving tests into a nested testList doesn't seem to run them in parallel/affect performance? -> linked code doesn't seem to be limiting factor :/ )

We're already sharing a server per "logical group" (for example: per CodeFix -- but not for each individual test inside each CodeFix).
Though pushing that further outward might be good for speed. I remember just reusing a document (docTestList) has some noticeable (but not huge) impact. And a single doc should be way more light than a full LSP server.

Yeah it would be something to test. If an LSP server per test is more stable, is it worth the extra 10% in time it takes to do the init/shutdown dance? Hard to say unless we try it out. Although I don't think it's our biggest source of slowness. I think sequenced combined with my findings below is the painful combination.

Yes. Inclusive a required short delay even when correct notification already arrived ... just to ensure it's really the latest notification. Over 100s and 1000s of tests this adds up...

I think getting rid of this notification stream would be the best for tests: Difficult to filter & debug, might be out of order, lots of waiting to ensure/hope correct messages, lots of caching of old messages.
Replacing this with direct calls and direct returns would be great -- but then also not the LSP way :/

Yeah the biggest slowness I think is the |> Observable.bufferSpan (timeout) as we have to wait for that full timeout. It would probably be better to have to pass in your desired criteria and only wait as long as we need. Something like Observable.choose ... |> Observable.timeout. This would probably give another big boost in speeding things up.

We could move to some pull based diagnostics and poll until we get what we need but I fear we might end up not "truly" testing how an LSP is used from a client.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth switching to RoslynSourceText by default then eventually removing NamedText

The rest of the discussion is too long, didn't read... but we should do that, and also switch to the Adaptive server, and then remove the old version.

@@ -3392,6 +3378,11 @@ type AdaptiveFSharpLspServer
try
return! codeFix codeActionParams
with e ->
logger.error (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good spot, thanks for adding this!


/// Computes the absolute of `n`
///
/// Unlike `abs` or `Math.Abs` this here handles `MinValue` and does not throw `OverflowException`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a wonderful comment - it answered the questions I had as soon as I read the module name :)

asyncResult {
let filePath = codeActionParams.TextDocument.GetFilePath() |> Utils.normalizePath
let fcsPos = protocolPosToPos codeActionParams.Range.Start
let! (parseAndCheck, lineStr, sourceText) = getParseResultsForFile filePath fcsPos
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I'm thinking about here - could this codefix be authored without the ParseAndCheckResults, instead relying only on the ParseResults? Because that would be a lot less work to compute.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked a bit ahead and it's not strictly possible, since the actual fix work requires data on the CheckResults. It may be useful for us to think about this more - ideally we'd be able to very quickly check if certain codefixes are even viable based on parse results as much as possible, then only if they might be viable start using CheckResults.

No changes required for this PR, just something that is on my mind as I review it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We basically had this same idea recently in F# analyzers, could be solved in similar way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I'm thinking about here - could this codefix be authored without the ParseAndCheckResults, instead relying only on the ParseResults? Because that would be a lot less work to compute.

Aren't ParseResults and CheckResults calculated and then cached at the same time? So as long as we collect both once we already have them "for free" everywhere else? And because we want diagnostics we already always need both. So ParseAndCheckResults is already collected before each CodeFix gets called.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently yes, I believe you're right - I'm thinking more future-facing. I think we will eventually need some more granular data-dependency-based system to make sure we're not doing work unnecessarily.

@baronfel baronfel merged commit f6d5ddc into ionide:main Sep 24, 2023
@Booksbaum Booksbaum deleted the ConvertIntConstants branch September 25, 2023 16:00
nojaf pushed a commit to nojaf/FsAutoComplete that referenced this pull request Nov 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants