F# RFC FS-1053 - voidptr, IsReadOnly structs, inref, outref, ByRefLike structs, byref extension members
.NET has a new feature Span
. This RFC adds support for this in F#.
- Approved in principle suggestion
- Implementation: Complete
- Discussion: #287
The main aim of the RFC is to allow effective consumption of APIs using Span
, Memory
and ref-returns.
Span is actually built from several features and a library:
- The
voidptr
type - The
NativePtr.ofVoidPtr
andNativePtr.toVoidPtr
functions - The
inref
andoutref
types via capabilities on byref pointers ByRefLike
structs (includingSpan
andReadOnlySpan
)IsReadOnly
structs- implicit dereference of byref and inref-returns from methods
byref
extension members
C# | F# |
---|---|
out int arg |
arg: byref<int> |
out int arg |
arg: outref<int> |
in int arg |
arg: inref<int> |
ref readonly int (return) |
normally inferred, can use : inref<int> |
ref expr |
&expr |
- C# 7.2 feature "ref structs"
- C# 7.2 "read only structs"
- One key part of the F# compiler code for ref structs is here
- C# 7.1 "readonly references"
- C# 7.2: Compile time enforcement of safety for ref-like types.
- Issue dotnet/fsharp#4576 deals with oddities in the warnings about struct mutation.
The main aim of the RFC is to allow effective consumption of APIs using Span
, Memory
, and ref-returns. This helps all of:
- Parity with .NET Core 2.1 and C#
- Proper interoperability with current and future .NET/C# APIs which use
Span
andMemory
heavily - Better code generation
- Better performance for F# when used to produce code that is performance-critical.
There are several elements to the design.
The voidptr
type represents void*
in F# code, i.e. an untyped unmanaged pointer.
This type should only be used when writing F# code that interoperates
with native code. Use of this type in F# code may result in
unverifiable code being generated. Conversions to and from the
nativeint
or other native pointer type may be required.
Values of this type can be generated
by the functions in the NativeInterop.NativePtr
module.
We add these:
//namespace FSharp.Core
module NativePtr =
val toVoidPtr : address:nativeptr<'T> -> voidptr
val ofVoidPtr : voidptr -> nativeptr<'T>
We add this type:
//namespace FSharp.Core
type byref<'T, 'Kind>
The following capabilities are defined:
//namespace FSharp.Core
module ByRefKinds =
/// Read capability
type In
/// Write capability
type Out
/// Read and write capabilities
type InOut
For example, the type byref<int, In>
represents a byref pointer with read capabilities.
The following abbreviations are defined:
//namespace FSharp.Core
type byref<'T> = byref<'T, ByRefKinds.InOut>
type inref<'T> = byref<'T, ByRefKinds.In>
type outref<'T> = byref<'T, ByRefKinds.Out>
Here byref<'T>
is the existing F# byref type.
A special constraint solving rule allows ByRefKinds.InOut :> ByRefKinds.In
and ByRefKinds.InOut :> ByRefKinds.Out
in subsumption,
e.g. most particularly at method and function argument application. This
means, for example, that a byref<int>
can be passed where an inref<int>
is expected (dropping the Out
capability) and
a byref<int>
can be passed where an outref<int>
is expected (dropping the In
capability).
Capabilities assigned to pointers get solved in the usual way by type inference. Inference variables for capabilities
are introduced each point the operators &expr
and NativePtr.toByRef
are used. In both cases their listed return
type of byref<T>
is expanded to byref<T, ?Kind>
for a new type inference variable ?Kind
.
NOTE: In F# 4.5, the byref<'T, 'Kind>
type and the tags ByRefKinds.InOut
, ByRefKinds.In
, ByRefKinds.Out
are not available for direct use, and are only usable in FSharp.Core.dll.
The type inref<'T>
is defined as byref<'T, ByRefKinds.In>
to indicate a read-only byref, e.g. one that acts as an input parameter. It's primary use is to pass structures more efficiently, without copying and without allowing mutation. For example:
let f (x: inref<System.DateTime>) = x.Days
Semantically inref<'T>
means "the holder of the byref pointer may only use it to read". It doesn't imply that other threads or aliases don't have write access to that pointer. And inref<SomeStruct>
doesn't imply that the struct is immutable. However inref<SomeStruct>
does imply that any pointer acquired to struct fields nested within SomeStruct
are in turn
given type inref<_>
. Together with other rules this means inref<'T>
also implies "the holder of the byref pointer may not modify the immediate contents of the memory pointed to".
By the type inference rules above, a byref<'T>
may be used where an inref<'T>
is expected.
For methods and properties on F# value types, the F# value type this
parameter
is given type inref<'T>
if the value type is considered immutable (has no mutable fields and no mutable sub-structs).
The type outref<'T>
is defined as byref<'T, ByRefKinds.Out>
to indicate a write-only byref that can act as an output parameter.
let f (x:outref<System.DateTime>) = x <- System.DateTime.Now
Semantically outref
means "the holder of the byref pointer may only use it to write". It doesn't imply that other threads or aliases don't have read access to that pointer.
By the type inference rules above, a byref<'T>
may be used where an outref<'T>
is expected.
The following examples show code that is not permitted:
module WriteToInRef =
let f1 (x: inref<int>) = x <- 1 // not allowed
module InRefToByRef =
let f1 (x: byref<'T>) = 1
let f2 (x: inref<'T>) = f1 &x // not allowed
module InRefToOutRef =
let f1 (x: outref<'T>) = 1
let f2 (x: inref<'T>) = f1 &x // not allowed
module InRefToByRefClassMethod =
type C() =
static member f1 (x: byref<'T>) = 1
let f2 (x: inref<'T>) = C.f1 &x // not allowed
module InRefToOutRefClassMethod =
type C() =
static member f1 (x: outref<'T>) = 1 // not allowed
let f2 (x: inref<'T>) = C.f1 &x
Similarly, a write to an inner struct field is not permitted:
[<Struct>]
type S =
[<DefaultValue(true)>]
val mutable X : int
module WriteToInRefStructInner =
let f1 (x: inref<S>) = x.X <- 1 //not allowed
Similarly, taking the address of an inner struct field via an inref<_>
pointer gives an inref<_>
pointer type:
[<Struct>]
type S =
[<DefaultValue(true)>]
val mutable X : int
module WriteToInRefStructInner =
let f1 (x: inref<S>) =
let addr = &x.X
addr <- 1 //not allowed
However, a similar set of rules do not apply to outref<'T>
. This is for compatibility reasons, and the outref<'T>
type is considered essentially equivalent to byref<'T>
: you can both read and write from an outref<_>
type, and no specific checks are performed to ensure definite assignment to outref<_>
. Further, instances of [<Out>]
or [out]
attributes occurring in .NET metadata or from provided types are interpreted as the type byref<'T>
. This means the outref<_>
is really for code documentation purposes only. The reason for this is compatibility: method signatures using [out]
are already being implemented by byref
types. This means outref<'T>
types are never introduced implicitly by F#.
For compatibility, .NET parameters using the [in]
attribute (e.g. [in] int32& p
) are interpreted as type byref<'T>
(e.g. byref<int32>
). This is for similar reasons to the above: .NET method signatures already exist that may use [in]
. This means the only place where inref<'T>
types are introduced implicitly by F# is
- For a .NET parameter or return type that has an
IsReadOnlyAttribute
modreq
, see below. - For the
this
pointer on a struct type that has no mutable fields, see below if a .NET signature also has amodreq
attribute forIsReadOnly
on a parameter, see below. - For the address of a memory location derived from another
inref<_>
pointer.
F# 4.1 added return byrefs. We adjust these so that they implicitly dereference when a call such as span.[i]
is used, which calls span.get_Item(i)
method on Span
. To avoid the implicit dereference you must write &span.[i]
.
For example:
let SafeSum(bytes: Span<byte>) =
let mutable sum = 0
for i in 0 .. bytes.Length - 1 do
sum <- sum + int bytes.[i]
sum
This applies to module-defined functions returning byrefs as well, e.g.
let mutable x = 1
let f () = &x
let test() =
let addr : byref<int> = &f()
addr <- addr + 1
However it specifically doesn't apply to:
-
the
&
operator itself -
NativePtr.toByRef
which is an existing library function returning a byref
As noted above, in these cases the returned type byref<T>
is expanded to byref<T, ?Kind>
for a new type inference variable ?Kind
.
When iterating over a collection using a for..in
expression that returns a byref as its item, that item will be implicitly dereferenced.
For example:
let test (s: Span<int>) =
for x in s do
()
x
is of type int
because the item returned from Span is implicitly dereferenced at that point.
Direct assignment to returned byrefs is permitted, e.g.
let mutable v = System.DateTime.Now
let f() = &v
f() <- f().AddDays(2.0)
Likewise for byrefs returned by methods, e.g.
type C() =
let mutable v = System.DateTime.Now
member __.InstanceM() = &v
let F1() =
let today = System.DateTime.Now.Date
let c = C()
c.InstanceM() <- today.AddDays(2.0)
And likewise for properties, e.g.
type C() =
let mutable v = System.DateTime.Now
member __.InstanceProperty = &v
let F1() =
let today = System.DateTime.Now.Date
let c = C()
c.InstanceProperty <- today.AddDays(2.0)
The general idea is that a let-bound value cannot have its reference escape the scope from which it was declared, effectively its visibility. For example:
let test () =
let x =
let y = 1
&y
()
&y
will now receive an error describing that it's escaping its local scope.
Another example:
let test () =
let z = 1
let x =
let y = 1
if true then
&y
else
&z
()
&y
will also give the same error.
The rationale behind this is not just for soundness reasons, but for guarantees in the optimizer that local values like this will not escape in such a manner.
F# structs are normally readonly, it is quite hard to write a mutable struct. Knowing a struct is readonly gives more efficient code and fewer warnings.
Example code:
[<IsReadOnly; Struct>]
type S(count1: int, count2: int) =
member x.Count1 = count1
member x.Count2 = count2
IsReadOnly
is not added to F# struct types automatically, you must add it manually.
Note that [<IsReadOnly>]
does not imply [<Struct>]
. Both attributes have to be specified.
Using IsReadOnly
attribute on a struct which has a mutable field will give an error.
"ByRefLike" structs are stack-bound types with rules like byref<_>
and Span<_>
.
They declare struct types that are never allocated on the heap. These are
useful for high-performance programming as you get a set of strong checks
about the lifetimes and non-capture of these values. They are also potentially useful for correctness
in some situations where capture must be avoided.
- Can be used as function parameters, method parameters, local variables, method returns
- Cannot be static or instance members of a class or normal struct
- Cannot be captured by any closure construct (async methods or lambda expressions)
- Cannot be used as a generic parameter
- In F# 9 and higher, this restriction is relaxed if the generic parameter is defined in C# using the
allows ref struct
anti-constraint. F# can instantiate such generics in types and methods with byref-like types. As a few examples, this affects BCL delegate types (Action<>
,Func<>
), interfaces (IEnumerable<>
,IComparable<>
) and generic arguments with a user-provided accumulator function (String.string Create<TState>(int length, TState state, SpanAction<char, TState> action)
) - It is impossible to author generic code supporting byref-like types in F#.
- In F# 9 and higher, this restriction is relaxed if the generic parameter is defined in C# using the
Here is an example:
open System.Runtime.CompilerServices
[<IsByRefLike; Struct>]
type S(count1: int, count2: int) =
member x.Count1 = count1
member x.Count2 = count2
ByRefLike structs can have IsByRefLike members, e.g.
open System.Runtime.CompilerServices
[<IsByRefLike; Struct>]
type S(count1: Span<int>, count2: Span<int>) =
member x.Count1 = count1
member x.Count2 = count2
Note that [<IsByRefLike>]
does not imply [<Struct>]
both attributes have to be specified.
The F# approach to stackalloc
has always been to make it an "unsafe library function" whose use generates a "here be dragons" warning. The C# team make it part of the language and are able to do some additional checks.
In theory it would be possible to mirror those checks in F#. However, the C# team are considering further rule changes around stackalloc
in any case. Thus it seems ok (or at least consistent) if we follow the existing approach for F# and don’t
add any specific knowledge of stackalloc to the rules.
C# attaches an Obsolete
attribute to the Span
and Memory
types in order to give errors in down level compilers seeing these types, and presumably has special code to ignore it. We add a corresponding special case in the compiler to ignore the Obsolete
attribute on ByRefLike
structs.
The F# compiler doesn't emit these attributes when defining ByRefLike
types. Authoring these types in F# for consumption by down-level C# consumers will be extremely rare (if it ever happens at all). Down-level consumption by F# consumers will also never happen and if it does the consumer will discover extremely quickly that the later edition of F# is required.
-
A C#
ref
return value is given typeoutref<'T>
-
A C#
ref readonly
return value is given typeinref<'T>
(i.e. readonly) -
A C#
in
parameter becomes ainref<'T>
-
A C#
out
parameter becomes aoutref<'T>
-
Using
inref<T>
in argument position results in the automatic emit of an[In]
attribute on the argument -
Using
inref<T>
in return position results in the automatic emit of anmodreq
attribute on the return item -
Using
inref<T>
in an abstract slot signature or implementation results in the automatic emit of anmodreq
attribute on an argument or return -
Using
outref<T>
in argument position results in the automatic emit of an[Out]
attribute on the argument
When an implicit address is being taken for an inref
parameter, an overload with an argument of type SomeType
is preferred to an overload with an argument of type inref<SomeType>
. For example give this:
type C() =
static member M(x: System.DateTime) = x.AddDays(1.0)
static member M(x: inref<System.DateTime>) = x.AddDays(2.0)
static member M2(x: System.DateTime, y: int) = x.AddDays(1.0)
static member M2(x: inref<System.DateTime>, y: int) = x.AddDays(2.0)
let res = System.DateTime.Now
let v = C.M(res)
let v2 = C.M2(res, 4)
In both cases the overload resolves to the method taking System.DateTime
rather than the one taking inref<System.DateTime>
.
"byref" extension methods allow extension methods to modify the struct that is passed in. Here is an example of a C#-style byref extension member in F#:
open System.Runtime.CompilerServices
[<Extension>]
type Ext =
[<Extension>]
static member ExtDateTime2(dt: inref<DateTime>, x:int) = dt.AddDays(double x)
Here is an example of using the extension member:
let dt2 = DateTime.Now.ExtDateTime2(3)
The this
parameter on struct members is now inref<StructType>
when the struct type has no mutable fields or sub-structures.
This makes it easier to write performant struct code which doesn't copy values.
At return expressions (i.e. the bodies of lambdas, methods and functions), checks are made that the expression being returned has a lifetime longer than the method activation or function activation. The checks are a cut-down version of those for C#. Specifically, checks are only made for "rvalues" and the existing checks for "lvalues" (byrefs) are not adjusted in this RFC.
- An argument value is safe-to-return
- Most composite expressions are safe-to-return if all its parts are safe-to-return
let SafeSum (bytes: Span<byte>) =
let mutable sum = 0
for i in 0 .. bytes.Length - 1 do
sum <- sum + int bytes.[i]
sum
let TestSafeSum() =
// managed memory
let arrayMemory = Array.zeroCreate<byte>(100)
let arraySpan = new Span<byte>(arrayMemory);
SafeSum(arraySpan)|> printfn "res = %d"
// native memory
let nativeMemory = Marshal.AllocHGlobal(100);
let nativeSpan = new Span<byte>(nativeMemory.ToPointer(), 100);
SafeSum(nativeSpan)|> printfn "res = %d"
Marshal.FreeHGlobal(nativeMemory);
// stack memory
let mem = NativePtr.stackalloc<byte>(100)
let mem2 = mem |> NativePtr.toVoidPtr
let stackSpan = Span<byte>(mem2, 100)
SafeSum(stackSpan) |> printfn "res = %d"
The additions are not backwards compatible for two reasons:
-
The implementation of the RFC includes a bug fix to the design of ref-returns. Functions, methods and properties returning byrefs are now implicitly de-referenced. An explicit use of an address-of operator such as
&f(x)
or&C.M()
is needed to access the value as a byref pointer.An error message is used when a type annotation indicates that an implicit dereference of a ref-return is now being used.
-
"Evil struct replacement" now gives an error. e.g.
[<Struct>]
type S(x: int, y: int) =
member __.X = x
member __.Y = y
member this.Replace(s: S) = this <- s
Note that the struct is immutable, except for a Replace
method. The this
parameter will now be considered inref<S>
and
an error will be reported suggesting to add a mutable field if the struct is to be mutated.
Allowing this assignment was never intended in the F# design and I consider this as fixing a bug in the F# compiler now we have the machinery to express read-only references.
-
The addition of pointer capabilities slightly increases the perceived complexity of the language and core library.
-
The addition of
IsByRefLike
types increases the perceived complexity of the language and core library.
-
Make
inref
,byref
andoutref
completely separate types rather than algebraically related--> The type inference rules become much harder to specify and much more "special case". The current rules just need a couple of extra equations added to the inference engine.
-
Generalize the subsumption "InOut --> In" and "InOut --> Out" to be a more general feature of F# type inference for tagged/measure/... types.
--> Thinking this over
-
No implicit dereference on byref return from let-bound functions. Originally we did not do implicit-dereference when calling arbitrary let-bound functions. However usability feedback indicated this is a source of inconsistency
-
Implicit address-of for
inref<'T>
arguments on member methods. While this is part of the corresponding C# design, it has a specific scoping issue:
let x = &M(1)
The AST from that is translated into this:
let x =
let tmp = 1
&M(&tmp)
Because M
returns a byref
, we have to assume its scope will escape on return because it was passed a byref
from a local defined within the scope of the call. tmp
is narrow in scope.
One option to resolve this issue is to force explicit address-of for arguments on member methods that return a byref
. Everything else can be implicit.
--> Will try to bring this in for F# 5.0. We might start seeing APIs use inref<'T>
more that warrant this feature coming back.
- No implicit dereference of byrefs in
for..in
expressions when using&
syntax.
For example:
let test (s: Span<int>) =
for &x in s do
x <- 5
x
is now of type byref<int>
because it is not implicitly dereferenced due to using &
. Because &x
is a pattern, this might be quite tricky to get right.
- Implicit address-of is not applied for parameters to let-bound functions. For example:
let testIn (m: inref<System.DateTime>) =
()
let callTestIn() =
testIn System.DateTime.Now
The F# language does no conversions apart from subsumption at calls to let-bound functions (e.g. no F# lambda --> System.Func conversion). While the programmer may initially expect the same conversions to take place, it is better to maintain consistency and not add this one adhoc type-directed auto-convert. If we were to make a change here, it would be better to make a systematic change here that took into account all conversions.
-
No new or additional checks are made that
outref<_>
parameters are written to. F# has already had[<Out>]
parameters and their rarity means there's not a lot of value in adding the control-flow-based checks to implement this. -
F# structs are not always readonly/immutable.
- You can declare
val mutable
. The compiler knows about that and uses it to infer that the struct is readonly/immutable. - Mutable fields can be hidden by signature files
- Structs from .NET libraries are mostly assumed to be immutable, so defensive copies aren't taken
- You can do wholesale replacement of the contents of the struct through
this <- v
in a member (see above)
- You can declare
-
Note that F# doesn't compile
type DateTime with
member x.Foo() = ...
as a "byref this" extension member where x
is a readonly reference. Instead you have to define a C#-style byref extension method. This means a copy happens on invocation. If it did writing performant extension members for F# struct types would be very easy. Perhaps we should allow something like this
type DateTime with
member (x : inref<T>).Foo() = ...
or
type DateTime with
[<SomeNewCallByRefAttribute>]
member x.Foo() = ...
This makes it impossible to define byref extension properties in particular.
- Note that
IsByRefLikeAttribute
is only available in .NET 4.7.2.
namespace System.Runtime.CompilerServices
open System
open System.Runtime.CompilerServices
open System.Runtime.InteropServices
[<AttributeUsage(AttributeTargets.All,AllowMultiple=false); Sealed>]
type IsReadOnlyAttribute() =
inherit System.Attribute()
[<AttributeUsage(AttributeTargets.All,AllowMultiple=false); Sealed>]
type IsByRefLikeAttribute() =
inherit System.Attribute()
None