Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unclear error, FS3204: If a union type has more than one case and is a struct, then all fields within the union type must be given unique names #3648

Closed
abelbraaksma opened this issue Sep 25, 2017 · 23 comments · Fixed by #14105
Labels
Area-Diagnostics mistakes and possible improvements to diagnostics Feature Improvement
Milestone

Comments

@abelbraaksma
Copy link
Contributor

abelbraaksma commented Sep 25, 2017

FS3204: If a union type has more than one case and is a struct, then all fields within the union type must be given unique names

That error is thrown for a construct like this:

[<Struct>]
type A = B of int | C of int

Could we update that error to say what is actually wrong or somehow make it clearer? I have no idea what is wrong with the code above, as the two cases have unique names B and C.

@thinkbeforecoding
Copy link
Contributor

B and C are the cases, the fields are int and ... int... but you can give them names:

[<Struct>]
type A = B of b: int | C of c: int

So it should solve your problem, but it's true that the error message could be clearer.

@thinkbeforecoding
Copy link
Contributor

I see with a sizeof<A> that there is no field compaction in this case, event if both b and c fields are int, the size is 12 = 3 * sizeof<int>: one int for the case, one int for b and one for c.
In .net, reference fields cannot be merged with other fields, but value fields can.

@thinkbeforecoding
Copy link
Contributor

What would be a better error message to suggest the solution ?

@abelbraaksma
Copy link
Contributor Author

abelbraaksma commented Sep 26, 2017

@thinkbeforecoding, thanks for taking the time to look into this and explaining it. The current error is puzzling (and to be honest, I don't understand why Struct requires field names, I thought these were nothing more than decoration and shouldn't change the semantics).

Note that the documentation is confusing, as it says this:

As of F# 3.1, you can give an individual field a name, but the name is optional, even if other fields in the same case are named.

But given that this error is legitimate (I mean, I'm sure there's some plausible reason for it), I'd suggest something like the following:

Situation one, no field names at all or missing field names

FS3204: Missing field names for cases 'Foo', 'Bar' in struct union type; union types that are a struct and that have more than one case must contain unique field names for each case that wraps a type, as with [<Struct>]type A = B of fieldx: int | C of fieldy: double | D.

Situation two or more non-unique field names

FS3204: Field names 'field1', 'field2' or not unique in struct union. Union types that are a struct and that have more than one must contain unique field names for each case that wraps a type, as with [<Struct>]type A = B of fieldx: int | C of fieldy: double | D.

Situation three, non-unique names and missing names

Both errors are thrown.

If it is not possible to differentiate between these cases, or if it is not possible to enumerate the field names that are non-unique, we could consider a different message. But I think what helps in either case is simply adding an example (some other errors do that too). And by adding "that wraps a type" (or equivalent) we emphasize what we mean with "field names" (which may not be clear on the onset).

@thinkbeforecoding
Copy link
Contributor

I thought these were nothing more than decoration and shouldn't change the semantics

Actually when doing reflection on CLR types generated by a union without field names, the type has properties named Item1, Item2 ...
When you give names to Union type fields, they're named accordingly...
I think this is why...
when using reference Unions, each subclass can have the same properties Item1.. but with as struct, there is a single type in the end, and it cannot have several different properties Item1...

@abelbraaksma
Copy link
Contributor Author

@thinkbeforecoding, I'm not sure I follow your reasoning. Because with that idea, you wouldn't be allowed to have the names Item1 and Item2. I'd assume they must be unique to the type, and Item1 and Item2 are also unique to the type.

Just saying that, when declared as a struct, it should be possible to autogen these fieldnames. But I'm drifting away from my original issue. It is what it is and I'm glad the feature is there (though haven't used it much, I must add). But I do think we should do something about the error.

@kspeakman
Copy link

kspeakman commented Mar 12, 2019

I ran into this today and had to search for the error code. In normal (non-struct) usage of DUs, a dev gets accustomed to the field name on the union type not mattering. And in fact, I don't even think of it as a field. (That's really an underlying implementation detail.) So upon reading the message, I thought "My union cases all have unique names?!".

Anyway, I would suggest the error message mention that the case values (rather than "fields") have to be labeled with unique names. A longer term goal would be for the dev to not have to worry about naming values at all.

@forki
Copy link
Contributor

forki commented Mar 12, 2019 via email

@kspeakman
Copy link

kspeakman commented Mar 12, 2019

The OP has a simple repro:

[<Struct>]
type A = B of int | C of int
// produces error FS3204

If you are wanting context as to why, I'm using it for parsing. I am using simple DUs and then using struct attribute to avoid heap allocations. For example:

[<Struct>]
type StringPhase =
    | InString
    | InEscape
    | InByte1
    | InByte2 of a:int
    | InByte3 of b:int
    | InByte4 of c:int
// Forced to arbitrarily label a, b, c to avoid FS3204.
// They could all reuse the same underlying int field for even better efficiency,
// but the uniqueness constraint prevents that?

Here is the bit that uses the cases with data. These are only the branches that parse the unicode escaped characters within a string. They have the format \u{4 hex digits}.

let rec loop phase sb state =
    let append ch = StringBuilder.appendChar ch sb
    let nState = State.consume 1 state
    // <snip> getting c from state
    match phase, c with
    // <snip> cases to handle other things
    | InString, '\\' ->
        loop InEscape sb nState
    | InEscape, 'u' ->
        loop InByte1 sb nState
    | InByte1, x when isHex x ->
        let code = hexToInt x <<< 12
        // hexToInt 'f' = 0000 0000 0000 1111
        // <<< 12       = 1111 0000 0000 0000
        loop (InByte2 code) sb nState
    | InByte2 code, x when isHex x ->
        let nCode = code + (hexToInt x <<< 8)
        loop (InByte3 nCode) sb nState
    | InByte3 code, x when isHex x ->
        let nCode = code + (hexToInt x <<< 4)
        loop (InByte4 nCode) sb nState
    | InByte4 code, x when isHex x ->
        let chCode = code + hexToInt x
        let ch = System.Convert.ToChar(chCode)
        loop InString (append ch) nState
    | (InByte1 | InByte2 _ | InByte3 _ | InByte4 _), _ ->
        Error InvalidUnicodeSequence

@abelbraaksma
Copy link
Contributor Author

@forki, have a look at my original report above, it's a very simple sample. Basically, any struct union without explicit labels (which are optional according to the docs), throws this confusing error.

It seems reasonable that labels ought to be optional, but that's a feature suggestion. Given that they aren't optional for struct unions (apparently, or I've totally missed the mark), the error should say so. It currently doesn't and confuses programmers new to the 'required optional labels' of struct unions.

A more elaborate example and discussion on the perceived uselessness of this requirement was given by @kspeakman above, which I think makes a case for lifting this error entirely.

@dsyme dsyme added Area-Diagnostics mistakes and possible improvements to diagnostics and removed Area-Compiler labels Mar 31, 2022
@vzarytovskii vzarytovskii moved this to Not Planned in F# Compiler and Tooling Jun 17, 2022
@jimfoye
Copy link

jimfoye commented Sep 14, 2022

Hello from the future (5 years). Bumped into this today and the message was so puzzling, but fortunately this GitHub issue is the top search result on Google.

@voronoipotato
Copy link
Contributor

We just hit this in a demo for our F# meetup today :) .

@T-Gro
Copy link
Member

T-Gro commented Oct 5, 2022

IMO the best return of investment would be to change the name-generation strategy in case of structs - BUT, that would break backwards compatibility if those fields are used outside of F#. Not good.

I am thinking about having different messages depending on using / not using named fields.

My current understanding is that if a person is already using named fields, the message is not bad (?).
But in case of using autogenerated field names only, the message does not help at all (backed by the common understanding of field naming being optional in DUs).

Proposal:

FS3204: "If a union type has more than one case and is a struct, then all fields within the union type must be given unique field names, and not rely on auto-generated field names.
Instead of '[] type A = B of int | C of int' (no field names given), write '[] type A = B of b: int | C of c: int' (unique field names 'b' and 'c' assigned).
"

@T-Gro
Copy link
Member

T-Gro commented Oct 5, 2022

Exactly, should be optimized for a person who did not see it before, and maybe even for a person who might not know that fields in DUs can have optinal names.

What do you think of the message proposal? Or is there a better one?

@edgarfgp
Copy link
Contributor

edgarfgp commented Oct 5, 2022

I think your message proposal is more clear :) . I think we should use the DU field range to report error to make it more clear where the issue is , No the type one that is more ambiguous ?

@T-Gro
Copy link
Member

T-Gro commented Oct 5, 2022

True and a good idea. Issue is in the fields, not in the DU type

@edgarfgp
Copy link
Contributor

edgarfgp commented Oct 5, 2022

Hopefully fsharp/fslang-suggestions#699 gets implemented one day . to remove this limitation :)

@charlesroddie
Copy link
Contributor

charlesroddie commented Oct 13, 2022

Splitting up doesn't help

Situation one, no field names at all or missing field names
Situation two or more non-unique field names
Situation three, non-unique names and missing names

This doesn't clarify the current error message, as each situation is more complicated to understand than the original error message and there are three now not one. Moreover Situation one isn't a true description of what "field" means in the original message, which is indeed too complex, and in the process of simplifying it misrepresents it. There is actually always a field name.

The actual issues

The error message is accurate but references implementation details. Moreover it is sufficiently difficult to understand that the writer of the error message didn't understand that the "has more than one case and" clause is completely unnecessary!

I believe any lack of clarity doesn't appear in the error message examples given in this thread. They could be understood as if "field name" meant "the name (string option) assigned by the programmer" and those examples would make sense. (And they would split up into the more complex situations 1,2,3 if this were the case.)

But in order to understand the error fully you need to look at low level code and the fact that "Item(n+1)" is used as a field name for item number n if none is given. The actual error is that, when field names (string) are assigned to be either the name given, or "Item(n+1)", then two of these strings are identical.

This is very hard to clarify. Two options:

  1. Maintain the low-level references, and point out one of the duplicated field names. "Item1" or "x" etc..
  2. (Breaking) Remove the low-level references by requiring specifying names for each argument explicitly in struct unions, which then become field names. Then the error could avoid using the term "field", for example: "all data items in struct union must be given unique names".

Repository owner moved this from Not Planned to Done in F# Compiler and Tooling Oct 18, 2022
@abelbraaksma
Copy link
Contributor Author

I agree, it’s a tricky message to improve. But it also hints at an implementation detail (like you mentioned) that’s totally superfluous. Just as with normal DUs, the compiler can freely come up with field names. That won’t be a breaking change, as currently, that’s not possible to do.

I thought there was already an issue or proposal for this, but I couldn’t readily find it. Also, typing this on mobile, so browsing is a bit limiting. Either way, the best fix for this confusing error is not having this error at all.

@abelbraaksma
Copy link
Contributor Author

Oh wait, nvm, this has meanwhile been fixed. Awesome 👏 @edgarfgp and @T-Gro. For posterity, if I read it correctly, the new message is a vast improvement:

If a multicase union type is a struct, then all union cases must have unique names. For example: 'type A = B of b: int | C of c: int'.

@voronoipotato
Copy link
Contributor

Honestly I expected it to have come up with field names based on the case name by default unless explicitly overridden with different field names, but with the new error message it is at least intuitive to resolve.

@Xyncgas
Copy link

Xyncgas commented Jan 20, 2024

B of b :b | C of c :c is better

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Diagnostics mistakes and possible improvements to diagnostics Feature Improvement
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.