The design suggestion "Allow empty CE body and return Zero" has been marked "approved in principle."
This RFC covers the detailed proposal for this suggestion.
- Allow empty CE body and return Zero
- Approved in principle
- Implementation
- Discussion
We add compiler support for using a computation expression with an empty body to represent a computation or data structure's zero, empty, or identity value.
let xs = seq { } // Empty sequence.
let html =
    div {
        p { "Some content." }
        p { } // Empty <p> element.
    }
F# computation expressions, whether built into FSharp.Core or user-defined, are a useful tool for declaratively describing computations or the construction and composition of a variety of data structures.
Many well-known computations or data structures for which computation expressions are used have an empty or zero value. For example, a monoid like 'a seq has an identity element, i.e., the empty sequence.
Computation expressions are also widely used in the creation of domain-specific languages (DSLs) that enable describing complex domains with concise, declarative code. It is common in user interface libraries, for example, to use a computation expression to represent an HTML element or a component in a mobile application UI framework.
Many of these data structures, like an HTML div or p element, again have a logical zero or empty value.
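To make the DSL scenario concrete, here is a minimal sketch of an element builder whose Zero method supplies the empty child list. The Node type and ElementBuilder class are hypothetical and are not taken from any real UI library; the example uses explicit yield and the unit-bodied form that already compiles today.

```fsharp
// Hypothetical minimal HTML model; not part of any real library.
type Node =
    | Text of string
    | Element of tag: string * children: Node list

type ElementBuilder (tag: string) =
    // Zero supplies the "no children" value for a unit-typed body.
    member _.Zero () : Node list = []
    member _.Yield (node: Node) : Node list = [ node ]
    member _.Combine (xs: Node list, ys: Node list) : Node list = xs @ ys
    member _.Delay (f: unit -> Node list) : Node list = f ()
    member _.Run (children: Node list) : Node = Element (tag, children)

let div = ElementBuilder "div"
let p = ElementBuilder "p"

let html =
    div {
        yield p { yield Text "Some content." }
        yield p { () } // unit body: the builder's Zero gives an empty <p>
    }
```

The second p element ends up with no children because its unit-typed body is translated into a call to Zero.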
It is not currently possible to represent this empty, zero, or identity value in the natural, obvious way.
seq { }
----^^^
stdin(1,5): error FS0789: '{ }' is not a valid expression. Records must include at least one field. Empty sequences are specified by using Seq.empty or an empty list '[]'.
div { }
^^^
stdin(1,1): error FS0003: This value is not a function and cannot be applied.
seq { () }
div { () }
This is in contrast to list and array comprehension expressions, which, though they share many similarities with sequence expressions and regular computation expressions, do allow the representation of the empty or identity value by omitting a body altogether.
[]
[||]
We do not, strictly speaking, change the syntax of computation expressions, as defined in § 6.3.10 of the F# language specification. That is, we do not change the behavior of the parser.
We instead update the typechecker to detect and rewrite builder { } as builder { () }, which already typechecks and results in the desired generated code. We do this by detecting the syntactic application of a value to a fieldless SynExpr.Record and rewriting it during typechecking to the application of a value to a SynExpr.ComputationExpr with a synthetic unit body.
A computation expression with an empty body, like
builder { }
is currently parsed and represented in the untyped abstract syntax tree (AST) as an application of a value to a fieldless (recordFields=[]) SynExpr.Record:
SynExpr.App
    (ExprAtomicFlag.NonAtomic,
     false,
     SynExpr.Ident (Ident ("builder", m)),
     SynExpr.Record (None, None, [], m),
     m)
For comparison, a computation expression with a non-empty body, like
builder { () }
is parsed instead as an application of a value to a SynExpr.ComputationExpr:
SynExpr.App
    (ExprAtomicFlag.NonAtomic,
     false,
     SynExpr.Ident (Ident ("builder", m)),
     SynExpr.ComputationExpr (false, SynExpr.Const (SynConst.Unit, m), m),
     m)
The application to SynExpr.ComputationExpr passes typechecking because SynExpr.ComputationExpr is special-cased and skipped during the propagation of type information from delayed applications.
The application to a fieldless SynExpr.Record does not pass typechecking because it is not special-cased and thus not skipped. This means that the compiler attempts to typecheck the fieldless record syntax { } as a record construction expression, which fails, since fieldless records are not allowed.
The parsing of a computation expression with an empty body, like
builder { }
remains unchanged.
We instead update the typechecking logic as follows.
Just as checking of the computation expression body is already skipped for SynExpr.ComputationExpr during the propagation of type information from delayed applications, we now do the same for SynExpr.Record (None, None, [], _), i.e., { }, when it is the object of an application expression.
Later, when typechecking the application itself, we may now detect the following syntax, representing the original builder { }:
SynExpr.App
    (ExprAtomicFlag.NonAtomic,
     false,
     SynExpr.Ident (Ident ("builder", m)),
     SynExpr.Record (None, None, [], m),
     m)
We then rewrite it to
SynExpr.App
    (ExprAtomicFlag.NonAtomic,
     false,
     SynExpr.Ident (Ident ("builder", m)),
     SynExpr.ComputationExpr (false, SynExpr.Const (SynConst.Unit, range0), m),
     m)
(equivalent to builder { () }) before we continue typechecking.
(Note that in the special case when builder is seq from FSharp.Core, we additionally set hasSeqBuilder=true in the SynExpr.ComputationExpr case constructor, i.e., SynExpr.ComputationExpr (true, SynExpr.Const (SynConst.Unit, range0), m).)
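The rewrite step can be sketched against a deliberately simplified model of the untyped AST. The Range, SynConst and SynExpr types below are hypothetical stand-ins with fewer fields than the real compiler types, and the name-based check for seq is an assumption made for brevity; the actual compiler resolves seq from FSharp.Core rather than matching on the identifier text.

```fsharp
// Simplified, hypothetical stand-ins for the compiler's untyped AST types.
type Range =
    | Source of line: int * column: int
    | Range0 // models the compiler's synthetic range0

type SynConst =
    | UnitConst

type SynExpr =
    | Ident of name: string * range: Range
    | Const of constant: SynConst * range: Range
    | Record of fields: (string * SynExpr) list * range: Range
    | ComputationExpr of hasSeqBuilder: bool * body: SynExpr * range: Range
    | App of funcExpr: SynExpr * argExpr: SynExpr * range: Range

/// Rewrites `builder { }` (application to a fieldless record) into
/// `builder { () }` with a synthetic unit body. Other nodes are returned as-is.
let rewriteEmptyBody (expr: SynExpr) : SynExpr =
    match expr with
    | App (func, Record ([], mRecord), mApp) ->
        let hasSeqBuilder =
            match func with
            | Ident ("seq", _) -> true // assumption: name check instead of real resolution
            | _ -> false
        App (func, ComputationExpr (hasSeqBuilder, Const (UnitConst, Range0), mRecord), mApp)
    | other -> other

let m = Source (1, 1)
let rewritten = rewriteEmptyBody (App (Ident ("builder", m), Record ([], m), m))
let rewrittenSeq = rewriteEmptyBody (App (Ident ("seq", m), Record ([], m), m))
```

In this model only a record with no fields in argument position is rewritten, so an ordinary record expression passed to a function is left untouched.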
That is, it is now the case that builder { } ≡ builder { () }. The typechecking of builder { () } already results in a call to the builder's Zero method, which is the desired behavior for the new syntax.
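The existing behavior for a unit-typed body can be observed with a small builder that counts calls to Zero. CountingBuilder is a hypothetical type written only for this illustration; it defines neither Delay nor Run, so the body is translated directly into a call to Zero.

```fsharp
// Hypothetical builder used only to observe calls to Zero.
type CountingBuilder () =
    let mutable zeroCalls = 0
    member _.ZeroCalls = zeroCalls
    member _.Zero () : int list =
        zeroCalls <- zeroCalls + 1
        []

let builder = CountingBuilder ()

// A unit-typed body already compiles today and goes through Zero.
let xs = builder { () }
```

After this runs, xs is the empty list and builder.ZeroCalls is 1.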
We mark the inserted () (SynExpr.Const (SynConst.Unit, range0)) as synthetic by the use of range0, indicating that the construct is compiler-generated and does not come from the original source code.
We can later use this synthetic marker during the typechecking of the computation expression body to differentiate between a user-supplied and synthetic ().
When the computation expression body includes a unit-typed value, that value is user-supplied (not marked as synthetic), and the builder type has no Zero method, we continue to emit the existing error diagnostic:
error FS0708: This control construct may only be used if the computation expression builder defines a 'Zero' method
When the builder type has no Zero method and we instead detect the single synthetic () that we have inserted, we can emit a new error message specific to the new syntax:
error FS0708: An empty body may only be used if the computation expression builder defines a 'Zero' method.
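The choice between the two messages can be modeled as a function of the range attached to the unit value. This is a sketch under the assumption that the synthetic marker is the only input needed; the Range type and missingZeroMessage function are hypothetical and do not mirror the compiler's internal API.

```fsharp
// Hypothetical model of a source range; Range0 stands in for the compiler's range0.
type Range =
    | Source of line: int * column: int
    | Range0

/// Picks the FS0708 message text for a unit body when the builder has no Zero method.
let missingZeroMessage (unitRange: Range) : string =
    match unitRange with
    | Range0 ->
        // Synthetic unit inserted by the rewrite: the user wrote `builder { }`.
        "An empty body may only be used if the computation expression builder defines a 'Zero' method."
    | Source _ ->
        // User-supplied unit, e.g. `builder { () }`: keep the existing message.
        "This control construct may only be used if the computation expression builder defines a 'Zero' method"

let emptyBodyMessage = missingZeroMessage Range0
let unitBodyMessage = missingZeroMessage (Source (8, 30))
```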
There are two main ways in which the following F# syntax could be interpreted:
expr { }
1. The application of a computation expression builder value to an empty computation expression body.
2. The application of a function to an incomplete record construction expression.
These interpretations are perhaps equally likely a priori from the user's point of view: as the user enters source code, the user is as likely to have (2) in mind as (1). The code is incomplete under either interpretation according to the current rules, since neither empty-bodied computation expressions nor empty records are allowed.
As mentioned earlier in this document, however, the parser currently treats and will continue to treat this syntax as (2).
By updating the typechecker to rewrite (2) as (1), we are favoring an interpretation at odds with the parser (and, technically, the language specification). This could in theory lead to a confusing error message if the compiler assumes interpretation (1) and the user assumes interpretation (2), or vice versa.
Specifically, the compiler would previously always attempt to typecheck expr in expr { } as a function value; if it could not be typechecked as a function value, the compiler, following interpretation (2), would emit:
expr { }
^^^^
stdin(1,1): error FS0003: This value is not a function and cannot be applied.
It did not matter whether expr was actually a computation expression builder value, or, if it was, which methods the builder type exposed; the compiler never even tried to typecheck it as one.
After this change, the only scenario with differing behavior is when expr cannot be typechecked as a function value (its type is not a function type or a type variable).
Before, this simply resulted in the message that expr is not a function and cannot be applied. Now, it will result in compilation as a computation expression if expr is a computation expression builder value whose type has a Zero method. If the type of expr does not have a Zero method, the new error message indicating that a Zero method is required for empty-bodied computation expressions will be emitted.
If expr's type is a function type, or could be a function type (because it is a type variable), there is no change in behavior.
That is, we are now favoring interpretation (1) when expr is definitely not a function value. Even without the addition of empty-bodied computation expression support, it could be argued that an incomplete computation expression is more likely to be the user's intent in this scenario than the application of a non-function to an incomplete record construction expression.
It seems like this drawback is not actually a drawback in practice.
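The behavior after the change, as described in this section, can be summarized as a small decision function. ExprType and Outcome are hypothetical types introduced only to tabulate the cases; they do not correspond to anything in the compiler.

```fsharp
// Hypothetical classification of the type of `expr` in `expr { }`.
type ExprType =
    | FunctionOrTypeVariable        // is, or could be, a function type
    | NonFunction of hasZero: bool  // definitely not a function value

type Outcome =
    | Unchanged        // same result as before this RFC
    | CompiledViaZero  // treated as `expr { () }` and compiled through Zero
    | Diagnostic of string

/// Models how `expr { }` is handled once the rewrite is in place.
let outcomeAfterChange (ty: ExprType) : Outcome =
    match ty with
    | FunctionOrTypeVariable -> Unchanged
    | NonFunction true -> CompiledViaZero
    | NonFunction false ->
        Diagnostic "FS0708: An empty body may only be used if the computation expression builder defines a 'Zero' method."
```

Before the change, both NonFunction cases produced error FS0003; only those two rows differ.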
The original language suggestion for this feature notes that, while the value that a type function like Seq.empty<'T> produces is generalizable, the value produced by seq { () } (or any other builder) is not.
let xs = Seq.empty // 'a seq
let ys = xs |> Seq.map ((+) 1) // int seq
let zs = xs |> Seq.map ((+) 1.0) // float seq
let xs = seq { () } // int seq
let ys = xs |> Seq.map ((+) 1) // int seq
let zs = xs |> Seq.map ((+) 1.0) // Doesn't work.
----------------------------^^^
stdin(3,29): error FS0001: The type 'int' does not match the type 'float'
Since we propose in this RFC that we compile seq { } as seq { () }, this means that the values produced by empty-bodied computation expressions will also not be generalizable.
It seems reasonable not to make builder { } generalizable for several reasons:
1. Not all computation expressions are generic.
2. Generalizability of computation expression values would be a new feature altogether. It seems like it could only apply to computation expressions with empty bodies. Would it also be applied to builder { () }? See (4) below.
3. There is currently no way to make a builder's Zero method (whether wrapped in Delay and/or Run or not) a generalizable value, even if Zero is made generic, because Zero must be a method, while GeneralizableValueAttribute only works on type functions.
4. builder { () } and builder { printfn "This is a side effect." } are indistinguishable from the type system's perspective. It is surely undesirable to make the latter, side-effecting expression generalizable (although see discussion here and here), and making the generalizability of the resulting value depend on the purity of the unit-valued body would be untenable. This means that we would need to treat an empty body altogether differently from a unit-valued body. Even then, in order to produce a generalizable value, we would need to devise some mechanism whereby the user could annotate some value as the one to be used in this scenario (cf. the approach taken in C# with CollectionBuilderAttribute), the compiler could rewrite the original expression to use that instead, and so on. But this would add significant complexity for what seems like little gain.
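For comparison, this is roughly what a generalizable value looks like when it is defined as a type function, which is the only form GeneralizableValueAttribute applies to. The name emptyItems is hypothetical; the pattern is the same one a type function like Seq.empty<'T> relies on.

```fsharp
// A user-defined type function marked as generalizable.
// `emptyItems` is a hypothetical name chosen for this sketch.
[<GeneralizableValue>]
let emptyItems<'T> : 'T list = []

// Because the right-hand side is a generalizable type function,
// `xs` keeps a generic type and can be used at several element types.
let xs = emptyItems
let ints = 1 :: xs
let strings = "a" :: xs
```

A builder's Zero cannot be declared this way because it has to be a method on the builder type.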
Update the parser to parse
builder { }
directly as
SynExpr.App
    (ExprAtomicFlag.NonAtomic,
     false,
     SynExpr.Ident (Ident ("builder", m)),
     SynExpr.ComputationExpr (false, SynExpr.Const (SynConst.Unit, range0), m),
     m)
- No need to make any changes to typechecking.
- The AST no longer represents what the user wrote. That is, the user wrote builder { }, not builder { () }, but the distinction between these is no longer possible to represent in the AST.
- It may be difficult to foresee all potential effects on and interactions with existing parsing behavior. At a glance, the parsing of expressions involving curly braces { … } is already rather complex and involves multiple layers of non-obvious fallthroughs, etc.
In the untyped abstract syntax tree, update
SynExpr.ComputationExpr of hasSeqBuilder: bool * expr: SynExpr * range: range
to
SynExpr.ComputationExpr of hasSeqBuilder: bool * expr: SynExpr option * range: range
and update the parser to parse
builder { }
as
SynExpr.App
    (ExprAtomicFlag.NonAtomic,
     false,
     SynExpr.Ident (Ident ("builder", m)),
     SynExpr.ComputationExpr (false, None, m),
     m)
- The AST actually more closely represents what the user wrote, if their intent was to write the application of a computation expression builder value to an empty computation expression body. (See also cons below.)
- Minimal changes needed in the typechecker: just treat SynExpr.ComputationExpr (false, None, m) the same as SynExpr.ComputationExpr (false, Some (SynExpr.Const (SynConst.Unit, range0)), m).
- In the absence of a body and additional type information, it is theoretically just as likely that the user is attempting to apply a function to an incomplete record construction expression. In this case, the AST now diverges from the user's intent.
- Both parsing and typechecking must be updated.
- Represents a breaking change to the untyped AST.
- As in the previous alternative, there is risk and complexity in making changes to the parser's treatment of curly-braced expressions.
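Under this alternative, the typechecker change amounts to defaulting a missing body to a synthetic unit. A minimal sketch of that defaulting step follows, using a hypothetical Body type in place of the real SynExpr.

```fsharp
// Hypothetical, heavily simplified stand-in for a computation expression body.
type Body =
    | SyntheticUnit               // models SynExpr.Const (SynConst.Unit, range0)
    | UserExpr of source: string  // any body the user actually wrote

/// Treats an absent body the same as a synthetic unit body.
let effectiveBody (body: Body option) : Body =
    Option.defaultValue SyntheticUnit body

let fromEmpty = effectiveBody None
let fromUser = effectiveBody (Some (UserExpr "yield 1"))
```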
Please address all necessary compatibility questions:
- Is this a breaking change?
  - No.
- What happens when previous versions of the F# compiler encounter this design addition as source code?
  - One of two compilation errors is produced: one for seq expressions and another for custom builders, including async:
    seq { }
    ----^^^
    stdin(1,5): error FS0789: '{ }' is not a valid expression. Records must include at least one field. Empty sequences are specified by using Seq.empty or an empty list '[]'.
    async { }
    ^^^^^
    stdin(1,1): error FS0003: This value is not a function and cannot be applied.
- What happens when previous versions of the F# compiler encounter this design addition in compiled binaries?
  - Older compiler versions will be able to consume the compiled result of this feature without issue.
- If this is a change or extension to FSharp.Core, what happens when previous versions of the F# compiler encounter this construct?
  - N/A.
There is an existing compiler diagnostic used for { } and seq { }, namely
error FS0789: '{{ }}' is not a valid expression. Records must include at least one field. Empty sequences are specified by using Seq.empty or an empty list '[]'.
This will no longer apply to seq { } when targeting newer language versions, but the message must remain the same when using newer versions of the compiler to target older language versions.
The message remains applicable to future language versions, since someone may still try to use bare { } to represent an empty sequence.
We could add an augmented version that mentioned the now-valid seq { }, perhaps
error FS0789: '{{ }}' is not a valid expression. Records must include at least one field. Empty sequences are specified by using 'Seq.empty', 'seq { }', or an empty list '[]'.
This does not seem particularly necessary, however.
The compiler should emit an error when a computation expression has an empty body and no intrinsic or extension method member Zero : unit -> M<'T> on the builder type is in scope.
We reuse the existing error diagnostic ID FS0708, whose current message is:
error FS0708: This control construct may only be used if the computation expression builder defines a '%s' method
Since, however, there is no user-visible "control construct" in the new syntax in such a scenario, we add the following message variant for clarity:
error FS0708: An empty body may only be used if the computation expression builder defines a 'Zero' method.
type Builder () =
    member _.Delay f = f
    member _.Run f = f ()
let builder = Builder ()
let xs : int seq = builder { }
-------------------^^^^^^^^^^^
stdin(8,20): error FS0708: An empty body may only be used if the computation expression builder defines a 'Zero' method.
We continue to emit the older message when the user does supply a body, e.g.,
type Builder () =
    member _.Delay f = f
    member _.Run f = f ()
let builder = Builder ()
let xs : int seq = builder { () }
-----------------------------^^
stdin(8,30): error FS0708: This control construct may only be used if the computation expression builder defines a 'Zero' method
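For contrast, adding a Zero method to the same builder removes the FS0708 error for the unit-bodied form. This sketch only adds one member to the Builder type from the examples above; the return type annotation on Zero is a choice made here to keep the example monomorphic.

```fsharp
type Builder () =
    // With Zero defined, a unit-typed body is translated into a call to it.
    member _.Zero () : int seq = Seq.empty
    member _.Delay f = f
    member _.Run f = f ()

let builder = Builder ()

// Compiles: the body is wrapped in Delay, run by Run, and ends in Zero.
let xs : int seq = builder { () }
```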
Please list the reasonable expectations for tooling for this feature, including any of these:
- Debugging
  - Breakpoints/stepping
    - N/A.
  - Expression evaluator
    - N/A.
  - Data displays for locals and hover tips
    - N/A.
- Auto-complete
  - N/A.
- Tooltips
  - N/A.
- Navigation and go-to-definition
  - N/A.
- Error recovery (wrong, incomplete code)
  - N/A.
- Colorization
  - N/A.
- Brace/parenthesis matching
  - N/A.
- For existing code
  - The addition of this feature will not affect the performance of the compiler on existing code.
- For the new feature
  - We do not foresee any performance issues with the updated typechecking logic required for this feature.
This feature's effect on compilation time and space complexity scales linearly with the number of instances of empty-bodied computation expressions in source code. There is no limit foreseen to the number of instances of empty-bodied computation expressions in source code that the compiler will accept.
Does the proposed RFC interact with culture-aware formatting and parsing of numbers, dates and currencies? For example, if the RFC includes plaintext outputs, are these outputs specified to be culture-invariant or current-culture.
- No.
None.