Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add proposal for "ref and unsafe in iterators and async" #7994

Merged
merged 15 commits into from
Apr 2, 2024
167 changes: 167 additions & 0 deletions proposals/state-machine-restriction-updates.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,167 @@
# State machine restriction updates

## Summary
[summary]: #summary

- Allow `ref`/`ref struct` locals and `unsafe` blocks in iterators and async methods
provided they are used in code segments without any `yield` or `await`.
- Warn about `yield` inside `lock`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of saying the specific items we'll fix consider having a more general "unify behavior between iterators and async methods" statement

Copy link
Member Author

@jjonescz jjonescz Mar 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add that summarizing sentence, that's a good idea, but I will still leave the specific items as well because it's otherwise hard to tell at glance which specific things change.

- Consolidate spec: disallow pointers to hoisted variables
(pointers to captured variables are already disallowed).

## Motivation
[motivation]: #motivation

It is not necessary to disallow `ref`/`ref struct` locals and `unsafe` blocks in async/iterator methods
if they are not used across `yield` or `await`, because they do not need to be hoisted.

```cs
async void M()
{
await ...;
ref int x = ...; // error previously, proposed to be allowed
x.ToString();
await ...;
// x.ToString(); // still error
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider the following code:

async Task<int> M()
{
  int x = 42;
  ref int y = ref x;
  y.ToString();
  await Task.Delay();
  y = ref x; // error or okay?
  y.ToString(); // error or okay?

In this case we're reassigning the ref local so it doesn't need to be hoisted. Is this allowed or not?

I would lean towards "do not allow" until a user could provide fairly compelling reasons why it should be. But this part of the proposal is leaning on allowing behavior when the ref doesn't need to be hoisted hence I think we should call out specifically this case in the spec. Essentially when we re-assign after await / yield is that legal?

Copy link
Member Author

@jjonescz jjonescz Mar 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any specific concerns why we should disallow this? We could certainly disallow ref reassignments after await/yield and potentially lift the restriction later, but we are already lifting restrictions here, so I would lean towards allowing this.

Note that this would be allowed by the most straightforward implementation (removing some binding errors but leaving the after-lowering walker which finds usages of ref-like locals across awaits/yields).

Also your example seems equivalent to the following, so we would presumably need to disallow that as well:

int x = 42;
Span<int> y = new(ref x);
y.ToString();
await Task.Yield();
y = new(ref x);
y.ToString();

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any specific concerns why we should disallow this?

My concerns are complications of implementation. If this is straight forward to do then I'd be more okay with going this direction. Consider though that there seem to be real challenges with doing this.

// This will be legal in C# 13
ref struct S : IEnumerable<int> { .. } 

async Task M(S s)
{
  foreach (var i in s) {
    await Task.Delay(); 
  }
}

The value s here is definitely used after the await but the current spec language seemingly allows it. This same problem comes up when a ref struct is used in any block level construct: foreach, using, lock, etc ... Is the expectation that for all of these types of constructs that if they involve a ref struct that we error when a await / yield occurs inside them?

Also I'm wondering about the mechanism by which this feature is implemented. The spec says it's an error to use it. Did you consider as an alternative using definite assignment as a mechanism? One approach I was thinking of is essentially the following:

After a statement with a await or yield all ref struct or ref locals in scope are considered definitely unassigned.

That would seemingly provide all the behavior that we want but I haven't really dug into the implementation to see if it's practical.

Copy link
Member Author

@jjonescz jjonescz Mar 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value s here is definitely used after the await but the current spec language seemingly allows it.

It was my intention to disallow this but the spec language should be clarified, you're right. Thanks.

Is the expectation that for all of these types of constructs that if they involve a ref struct that we error when a await / yield occurs inside them?

Yes.

Also I'm wondering about the mechanism by which this feature is implemented. The spec says it's an error to use it. Did you consider as an alternative using definite assignment as a mechanism?

That's exactly what I want to do. There is already a definite assignment walker (IteratorAndAsyncCaptureWalker) which does what you suggest. It only supports ref structs now, so needs some small tweaking to support ref locals as well. Plus I will remove all the binding errors that restrict async with ref/ref struct locals, etc. And that's it. Note that the walker runs after lowering, so it automatically handles foreach/using/lock etc.

I'm not sure if the "definite assignment" mechanism should be part of the spec language as well. But I will try to incorporate it to clarify things.

```

On the other hand, having `yield` inside a `lock` means the caller also holds the lock while iterating which might lead to unexpected behavior.
This is even more problematic in async iterators where the caller can `await` between iterations, but `await` is not allowed in `lock`.
See also https://github.com/dotnet/roslyn/issues/72443.

```cs
lock (this)
{
yield return 1; // warning proposed
}
```

## Detailed design
[design]: #detailed-design

[§13.3.1 Blocks > General][blocks-general]:

> ~~It is a compile-time error for an iterator block to contain an unsafe context ([§23.2][unsafe-contexts]).~~
> An iterator block always defines a safe context, even when its declaration is nested in an unsafe context.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: use ```diff blocks

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's just me, but for text/spec changes strikethrough renders more nicely (because other text formatting is still applied). Diff blocks are rendered as plain text, which doesn't seem great for this purpose.


[§13.6.2.4 Ref local variable declarations][ref-local]:

> ~~It is a compile-time error to declare a ref local variable, or a variable of a `ref struct` type,
> within a method declared with the *method_modifier* `async`, or within an iterator ([§15.14][iterators]).~~
> **It is a compile-time error to declare and use a ref local variable, or a variable of a `ref struct` type
> across `await` or `yield` statements.**

[§13.13 The lock statement][lock-statement]:

> [...]
>
> **A warning is reported (as part of the next warning wave) when a `yield` statement
Copy link
Contributor

@AlekseyTs AlekseyTs Mar 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this only for legacy lock? I assume it will be an error to yield out of a "modern" lock. #Closed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the modern lock cannot work with yield since it's lowered to using of a ref struct. I will make it clearer.

> ([§13.15][yield-statement]) is used inside the body of a `lock` statement.**

No change in the spec is needed to allow `unsafe` blocks which do not contain `await`s in async methods,
because the spec has never disallowed `unsafe` blocks in async methods.
However, the spec should have always disallowed `await` inside `unsafe` blocks
(it had already disallowed `yield` in `unsafe` in [§13.3.1][blocks-general] as cited above), for example:

[§15.15.1 Async Functions > General][async-funcs-general]:

> It is a compile-time error for the formal parameter list of an async function to specify
> any `in`, `out`, or `ref` parameters, or any parameter of a `ref struct` type.
>
> **It is a compile-time error for an unsafe context ([§23.2][unsafe-contexts]) to contain `await` or `yield`.**

Furthermore, variables inside async or iterator methods should not be "fixed" but rather "moveable"
Copy link
Contributor

@AlekseyTs AlekseyTs Mar 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a breaking change? Is this going to require a user to use fixed depending on whether the local is hoisted or not? #Closed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is a breaking change. No, the intention is to disallow taking pointers of such locals entirely - that's consistent with how variables captured in lambdas and local functions work today. Although allowing to use fixed with those variables (both captured and hoisted) is a possible alternative or future work. I guess there are also other options:

  • Require fixed for all locals in async methods (no need to distinguish hoisted or not).
  • Leave it as is, maybe emit only a warning.

Copy link
Member Author

@jjonescz jjonescz Mar 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I discovered there is already a warning for this in C# 12 warning wave (which is likely why it's not visible in sharplab): dotnet/roslyn#66915

Given that we already decided to go with a warning here, I will remove this part of the feature, only leaving a note in alternatives.

if they need to be hoisted to fields of the state machine (similarly to captured variables).
Note that this is a pre-existing bug in the spec independent of the rest of the proposal
because `unsafe` blocks inside `async` methods were always allowed.

[§23.4 Fixed and moveable variables][fixed-vars]:

> In precise terms, a fixed variable is one of the following:
>
> - A variable resulting from a *simple_name* ([§12.8.4][simple-names]) that refers to a local variable, value parameter, or parameter array,
> unless the variable is captured by an anonymous function ([§12.19.6.2][captured-vars]) **or a local function ([§13.6.4][local-funcs])
> or the variable needs to be hoisted as part of an async ([§15.15][async-funcs]) or an iterator ([§15.14][iterators]) method**.
> - [...]

Note that more constructs can work thanks to `ref` allowed inside segments without `await` and `yield` in async/iterator methods
even though no spec change is needed specifically for them as it all falls out from the aforementioned spec changes:

```cs
using System.Threading.Tasks;

ref struct R
{
public int Current => 0;
public bool MoveNext() => false;
public void Dispose() { }
}
class C
{
public R GetEnumerator() => new R();
async void M()
{
await Task.Yield();
using (new R()) { } // allowed under this proposal
foreach (var x in new C()) { } // allowed under this proposal
foreach (ref int x in new int[0]) { } // allowed under this proposal
lock (new System.Threading.Lock()) { } // allowed under this proposal
await Task.Yield();
}
}
```

## Alternatives
[alternatives]: #alternatives

- `ref`/`ref struct` locals could be allowed only in blocks ([§13.3.1][blocks-general])
which do not contain `await`/`yield`:

```cs
// error always since `x` is declared/used both before and after `await`
{
ref int x = ...;
await Task.Yield();
x.ToString();
}
// allowed as proposed (`x` does not need to be hoisted as it is not used after `await`)
// but alternatively could be an error (`await` in the same block)
{
ref int x = ...;
x.ToString();
await Task.Yield();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a class of ref struct errors that are demoted to warnings inside an unsafe context. Think we should call out explicitly that this is not one of them. Essentially that this will remain and error even in an unsafe context.

These type of values cannot be safely manipulated in unsafe without relying on implementation details of how the state machine rewrite works. That falls outside the boundaries of what we've allowed to demote to warning in unsafe.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

safely manipulated in unsafe

:D

```

- It could be possible to use `fixed` to get the address of a hoisted or captured variable
although the fact that those are fields is an implementation detail
so in other implementations it might not be possible to use `fixed` on them.
Note that we only propose to consider also hoisted variables as "moveable",
but captured variables were already "moveable" and `fixed` was not allowed for them.

- We could allow `await`/`yield` inside `unsafe` except inside `fixed` statements (compiler cannot pin variables across method boundaries).
That might result in some unexpected behavior, for example around `stackalloc` as described below.
- We could disallow the unsafe variant of `stackalloc` in async/iterator methods,
because the stack-allocated buffer does not live across `await`/`yield` statements.
It does not feel necessary because unsafe code by design does not prevent "use after free".
Note that we could also allow unsafe `stackalloc` provided it is not used across `await`/`yield`, but
that might be difficult to analyze (the resulting pointer can be passed around in any pointer variable).
Or we could require it being `fixed` in async/iterator methods. That would *discourage* using it across `await`/`yield`
but would not match the semantics of `fixed` because the `stackalloc` expression is not a moveable value.
(Note that it would not be *impossible* to use the `stackalloc` result across `await`/`yield` similarly as
you can save any `fixed` pointer today into another pointer variable and use it outside the `fixed` block.)

[simple-names]: https://github.com/dotnet/csharpstandard/blob/ee38c3fa94375cdac119c9462b604d3a02a5fcd2/standard/expressions.md#1284-simple-names
[captured-vars]: https://github.com/dotnet/csharpstandard/blob/ee38c3fa94375cdac119c9462b604d3a02a5fcd2/standard/expressions.md#121962-captured-outer-variables
[blocks-general]: https://github.com/dotnet/csharpstandard/blob/ee38c3fa94375cdac119c9462b604d3a02a5fcd2/standard/statements.md#1331-general
[ref-local]: https://github.com/dotnet/csharpstandard/blob/ee38c3fa94375cdac119c9462b604d3a02a5fcd2/standard/statements.md#13624-ref-local-variable-declarations
[local-funcs]: https://github.com/dotnet/csharpstandard/blob/d11d5a1a752bff9179f8207e86d63d12782c31ff/standard/statements.md#1364-local-function-declarations
[lock-statement]: https://github.com/dotnet/csharpstandard/blob/ee38c3fa94375cdac119c9462b604d3a02a5fcd2/standard/statements.md#1313-the-lock-statement
[using-statement]: https://github.com/dotnet/csharpstandard/blob/ee38c3fa94375cdac119c9462b604d3a02a5fcd2/standard/statements.md#1314-the-using-statement
[yield-statement]: https://github.com/dotnet/csharpstandard/blob/ee38c3fa94375cdac119c9462b604d3a02a5fcd2/standard/statements.md#1315-the-yield-statement
[iterators]: https://github.com/dotnet/csharpstandard/blob/ee38c3fa94375cdac119c9462b604d3a02a5fcd2/standard/classes.md#1514-iterators
[async-funcs]: https://github.com/dotnet/csharpstandard/blob/ee38c3fa94375cdac119c9462b604d3a02a5fcd2/standard/classes.md#1515-async-functions
[async-funcs-general]: https://github.com/dotnet/csharpstandard/blob/ee38c3fa94375cdac119c9462b604d3a02a5fcd2/standard/classes.md#15151-general
[unsafe-contexts]: https://github.com/dotnet/csharpstandard/blob/ee38c3fa94375cdac119c9462b604d3a02a5fcd2/standard/unsafe-code.md#232-unsafe-contexts
[fixed-vars]: https://github.com/dotnet/csharpstandard/blob/ee38c3fa94375cdac119c9462b604d3a02a5fcd2/standard/unsafe-code.md#234-fixed-and-moveable-variables