Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First class span's break EFC #109757

Open
Suchiman opened this issue Nov 12, 2024 · 26 comments
Open

First class span's break EFC #109757

Suchiman opened this issue Nov 12, 2024 · 26 comments
Labels
area-System.Linq.Expressions breaking-change Issue or PR that represents a breaking API or functional change over a prerelease. untriaged New issue has not been triaged by the area owner

Comments

@Suchiman
Copy link
Contributor

Description

When enabling LangVersion preview, that enables first class span's which then prefers MemoryExtensions.Contains over Enumerable.Contains which then breaks EFC. It seems that EFC internally calls Expression.Compile with preferInterpretation set to true which then eventually crashes.

Reproduction Steps

public class MyDbContext : DbContext
{
    public DbSet<MyTable> MyTable { get; set; }
}

public class MyTable
{
    public int Id { get; set; }
    public string Name { get; set; }
}

string[] filters = ["Test"];

var dbContext = new MyDbContext();
dbContext.MyTable.Where(x => filters.Contains(x.Name)).ToList();

Expected behavior

First class span's should not negatively affect EFC queries

Actual behavior

TypeLoadException: GenericArguments[1], 'System.Span`1[System.String]', on 'System.Linq.Expressions.Interpreter.FuncCallInstruction`2[T0,TRet]' violates the constraint of type parameter 'TRet'.
   at System.RuntimeType.ValidateGenericArguments(MemberInfo definition, RuntimeType[] genericArguments, Exception e)
   at System.RuntimeType.MakeGenericType(Type[] instantiation)
   at System.Linq.Expressions.Interpreter.CallInstruction.GetHelperType(MethodInfo info, Type[] arrTypes)
   at System.Linq.Expressions.Interpreter.CallInstruction.SlowCreate(MethodInfo info, ParameterInfo[] pis)
   at System.Linq.Expressions.Interpreter.CallInstruction.Create(MethodInfo info, ParameterInfo[] parameters)
   at System.Linq.Expressions.Interpreter.LightCompiler.CompileMethodCallExpression(Expression object, MethodInfo method, IArgumentProvider arguments)
   at System.Linq.Expressions.Interpreter.LightCompiler.Compile(Expression expr)
   at System.Linq.Expressions.Interpreter.LightCompiler.CompileConvertUnaryExpression(Expression expr)
   at System.Linq.Expressions.Interpreter.LightCompiler.Compile(Expression expr)
   at System.Linq.Expressions.Interpreter.LightCompiler.CompileTop(LambdaExpression node)
   at System.Linq.Expressions.Expression`1.Compile(Boolean preferInterpretation)
   at Microsoft.EntityFrameworkCore.Query.Internal.ExpressionTreeFuncletizer.<Evaluate>g__EvaluateCore|70_0(Expression expression, String& parameterName, Boolean& isContextAccessor) in Microsoft.EntityFrameworkCore.Query.Internal\ExpressionTreeFuncletizer.cs:line 1755

Regression?

Yes, only when enabling <LangVersion>preview</LangVersion>

Known Workarounds

set <LangVersion>preview</LangVersion> to anything but preview

Configuration

.NET 9.0.0
EFC 9.0.0

Other information

No response

@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Nov 12, 2024
Copy link
Contributor

Tagging subscribers to this area: @cston
See info in area-owners.md if you want to be subscribed.

@jkotas
Copy link
Member

jkotas commented Nov 13, 2024

cc @jjonescz

@ChrisJollyAU
Copy link

@Suchiman Which version of VS are you on? 17.13 Preview 1?

I've just come across this when building the tests on efcore (same problem with MemoryExtesions.* vs IEnumerable.*) To me it only happened on 17.13 preview 1 AND LangVersion is preview. Using 17.12 even with LangVersion on preview it was fine to build

@Suchiman
Copy link
Contributor Author

@Suchiman Which version of VS are you on? 17.13 Preview 1?

yes

@ilGianfri
Copy link

I've attached a simple repro application ConsoleApp2.zip

On Visual Studio 17.12 it works, in 17.13p1 I get this issue with both .NET 8 and .NET 9 if LangVersion is set to preview

@ChrisJollyAU
Copy link

Any relation to #96160 ?

@jjonescz
Copy link
Member

jjonescz commented Dec 9, 2024

This was discussed at LDM with no change in the C# language planned. LINQ-to-DB should be probably updated to recognize MemoryExtensions.Contains (and perhaps other similar methods) or users will need to update the calls (e.g., calling Enumerable.Contains explicitly, or casting the array to IEnumerable<T>).

@GerardSmit
Copy link

Does this mean libraries need to fix this issue?

In that case it would mean .NET 9 (or lower) can never use the new C# LangVersion together with EF Core when First class span's get released?
You will either need to lower the C# LangVersion or upgrade .NET.

I know using newer C# LangVersions in older .NET versions isn't official supported, but I do this together with PolySharp a lot.

When I upgraded my .NET SDK locally, I got hit by this issue; since I had <LangVersion>preview</LangVersion> in my project during the .NET 7 to .NET 8 transition, I forgot I had the preview version still configured 😄.

@jjonescz
Copy link
Member

jjonescz commented Dec 9, 2024

In that case it would mean .NET 9 (or lower) can never use the new C# LangVersion together with EF Core when First class span's get released?

You can use .NET 9, just make sure you call Enumerable methods (e.g., by calling them in their non-extension static forms) instead of MemoryExtensions ones. As you said, older .NETs are not supported with newer LangVersions, so workarounds like that are expected.

@jkotas
Copy link
Member

jkotas commented Dec 9, 2024

This problem is not specific to EF Core. You can hit it with plain vanilla System.Linq.Expressions. It can be fixed in System.Linq.Expressions by adding support for byref-like types to the Interpreter, but that is in conflict with the effectively archived status of System.Linq.Expressions.

@jjonescz This needs a breaking change notice to be filled.

cc @jaredpar

@jkotas jkotas added the breaking-change Issue or PR that represents a breaking API or functional change over a prerelease. label Dec 9, 2024
@dotnet-policy-service dotnet-policy-service bot added the needs-breaking-change-doc-created Breaking changes need an issue opened with https://github.com/dotnet/docs/issues/new?template=dotnet label Dec 9, 2024
Copy link
Contributor

Added needs-breaking-change-doc-created label because this issue has the breaking-change label.

  1. Create and link to this issue a matching issue in the dotnet/docs repo using the breaking change documentation template, then remove this needs-breaking-change-doc-created label.

Tagging @dotnet/compat for awareness of the breaking change.

@roji
Copy link
Member

roji commented Dec 9, 2024

Just to second @jkotas's comment, our problem in EF is not recognizing MemoryExtensions.Contains and similar (though that in itself is also a breaking change) - it's the reliance on the LINQ interpreter in various scenarios, but the interpreter doesn't support this. More generally, there's a gap that's starting to widen between C# itself and its metaprogramming (and related) features: the interpreter can no longer handle any construct, and LINQ expression trees themselves only represent a small subset of C# (and getting smaller). This will become more and more problematic, unless a decision is made to start updating these components to bring them more in line with the latest C# developments.

@jjonescz
Copy link
Member

It can be fixed in System.Linq.Expressions by adding support for byref-like types to the Interpreter, but that is in conflict with the effectively archived status of System.Linq.Expressions.

That doc says "We will consider changes that address significant bugs or regressions". This seems like a significant regression.

our problem in EF is [...] the reliance on the LINQ interpreter in various scenarios, but the interpreter doesn't support this.

So... does EF need to implement its own interpreter to avoid relying on an archived component?

@roji
Copy link
Member

roji commented Dec 10, 2024

So... does EF need to implement its own interpreter to avoid relying on an archived component?

Well, the LINQ interpreter in .NET is a general component used by other users and packages out there, other than EF - they are all going to be similarly broken by this change; so if someone (regardless of who does the work) adds support to the interpreter, I don't believe that should be part of the EF package. This is probably a conversation best taken offline...

@ChrisJollyAU
Copy link

@jjonescz There's already 2 issues previously related to byref-like types and Span types in expression trees (#96160 and #27499). Both have basically said the same as @jkotas - that System.Linq.Expressions is effectively archived.

With the new first class span support being introduced, it's clear that there would be an increasing likelihood of more expression trees encountering the byref-like types. With this in mind, I feel the impact of that bug has changed enough to at the very least needing to be re-evaluated (and possibly implemented)

@GerardSmit
Copy link

GerardSmit commented Dec 10, 2024

A hacky workaround is to make an ExpressionVisitor that converts all the MemoryExtensions.Contains to Enumerable.Contains.

SharpLab link

Code

The ExpressionVisitor:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

internal class SpanToEnumerableVisitor : ExpressionVisitor
{
  private static readonly SpanToEnumerableVisitor Instance = new();

  private static readonly MethodInfo ContainsSpan = typeof(MemoryExtensions).GetMethod(
      "Contains",
      BindingFlags.Public | BindingFlags.Static,
      null,
      [typeof(ReadOnlySpan<>).MakeGenericType(Type.MakeGenericMethodParameter(0)), Type.MakeGenericMethodParameter(0)],
      null
  )!;

  private static readonly MethodInfo ContainsEnumerable = typeof(Enumerable).GetMethod(
      "Contains",
      BindingFlags.Public | BindingFlags.Static,
      null,
      [typeof(IEnumerable<>).MakeGenericType(Type.MakeGenericMethodParameter(0)), Type.MakeGenericMethodParameter(0)],
      null
  )!;

  protected override Expression VisitMethodCall(MethodCallExpression node)
  {
      if (node.Method.IsGenericMethod &&
          node.Method.GetGenericMethodDefinition() == ContainsSpan &&
          node.Method.GetGenericArguments() is [var type] &&
          node.Arguments[0] is MethodCallExpression { Method.Name: "op_Implicit" } callExpression &&
          callExpression.Arguments[0].Type == type.MakeArrayType())
      {
          return Expression.Call(
              ContainsEnumerable.MakeGenericMethod(type),
              Expression.Convert(callExpression.Arguments[0], typeof(IEnumerable<>).MakeGenericType(type)),
              node.Arguments[1]
          );
      }

      return base.VisitMethodCall(node);
  }

  public static T Convert<T>(T expression) where T : Expression => (T)Instance.Visit(expression);
}

Example:

string[] filters = ["Test"];
Expression<Func<MyTable, bool>> filter = x => filters.Contains(x.Name);

// Comment out the next line to check the exception
filter = SpanToEnumerableVisitor.Convert(filter);

Func<MyTable, bool> func = filter.Compile(preferInterpretation: true);
bool result = func(new MyTable { Name = "Test" });

Console.WriteLine(result);

public class MyTable
{
    public int Id { get; set; }
    public string Name { get; set; }
}

@jkotas
Copy link
Member

jkotas commented Dec 10, 2024

@jjonescz There's already 2 issues previously related to byref-like types and Span types in expression trees

@roji
Copy link
Member

roji commented Dec 11, 2024

A hacky workaround is to make an ExpressionVisitor that converts all the MemoryExtensions.Contains to Enumerable.Contains.

Yep, that's likely the workaround we'd do in EF for this.

@ChrisJollyAU
Copy link

While .Contains is the method giving the current problem, you would probably have to look at doing it for all methods from MemoryExtensions

@roji
Copy link
Member

roji commented Dec 11, 2024

@ChrisJollyAU indeed.

@ChrisJollyAU
Copy link

@jjonescz @jkotas Was also thinking on this.

EFC calls the .Compile with preferInterpretation being true.

With the function being as follows

public new TDelegate Compile(bool preferInterpretation)
{
    if (CanInterpret && preferInterpretation)
    {
        return (TDelegate)(object)new Interpreter.LightCompiler().CompileTop(this).CreateDelegate();
    }

    return Compile();
}

Clearly the CanInterpret variable is also true, even when the expression has the byref-like types/Span in it. Giving that the interpret fails it clearly can't interpret. That's a potential bug/mismatch there.

Even if the byref-like types wasn't added to the interpreter (if the verdict from the other mentioned issues is still valid now), I would say this part would be applicable - to make sure that the CanConvert is false if you have those types in the expression.

@roji This should satisfy you with regards to performance - those normal expressions will just be interpreted as normal, and then drop into the full compiler (which does work) for the restricted types

@jjonescz
Copy link
Member

This needs a breaking change notice to be filled.

Filed dotnet/docs#43952.

Even if the byref-like types wasn't added to the interpreter

It seems this would also require adding support for ref structs in reflection since the interpreter is using reflection under the hood.

Clearly the CanInterpret variable is also true, even when the expression has the byref-like types/Span in it. Giving that the interpret fails it clearly can't interpret. That's a potential bug/mismatch there.

This sounds like a potentially good solution to me - avoid interpretation if the expression tree contains spans. But is it possible to do compilation everywhere or is it necessary to fall back to interpretation in some cases (e.g., on some platforms)?

@ChrisJollyAU
Copy link

But is it possible to do compilation everywhere or is it necessary to fall back to interpretation in some cases (e.g., on some platforms)?

I think it's just performance reasons that EFC uses the interpreter. The expressions are evaluated only once so was faster with the interpreter than with the compilation and its overheads. @roji Any other reasons for using the interpreter?

@jjonescz jjonescz removed the needs-breaking-change-doc-created Breaking changes need an issue opened with https://github.com/dotnet/docs/issues/new?template=dotnet label Dec 12, 2024
@jkotas
Copy link
Member

jkotas commented Dec 12, 2024

This sounds like a potentially good solution to me - avoid interpretation if the expression tree contains spans. But is it possible to do compilation everywhere or is it necessary to fall back to interpretation in some cases (e.g., on some platforms)?

This does not work for native AOT that EF Core has been busy adding support for.

@jkotas
Copy link
Member

jkotas commented Dec 12, 2024

It seems this would also require adding support for ref structs in reflection since the interpreter is using reflection under the hood.

Yes, adding support for ref structs in reflection has been on the reflection backlog for a while. So far we did not have a good motivating scenario to justify investing into it.

@roji
Copy link
Member

roji commented Dec 12, 2024

@ChrisJollyAU and others, I think the EF strategy for dealing with this will be to perform early detection and replacing of expression nodes that cannot be interpreted, basically transforming the new tree back to the way it was - before the LINQ interpreter is ever invoked. In other words, we'd have a static, hard-coded table in EF's funcletizer (the very first visitor that processes the incoming the tree), and would simply rewrite the tree. So from a pure EF perspective, we'll likely be able to deal with this without any need to either disable interpretation, or something like CanInterpret.

Do you see any trouble with that approach @ChrisJollyAU?

Regardless, this will affect other users of the LINQ interpreter. I'm not sure CanInterpret would be very useful to them, and I'm generally not in love with "magically" switching between interpretation and compilation based on something that the user has no knowledge or control over (i.e. there's a performance cliff the moment you introduce a Contains, or possibly other discrepancies). I'd at least want to discuss this with other users of the interpreter first...

I think it's just performance reasons that EFC uses the interpreter. The expressions are evaluated only once so was faster with the interpreter than with the compilation and its overheads. @roji Any other reasons for using the interpreter?

Yep, that was the reason, nothing else.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-System.Linq.Expressions breaking-change Issue or PR that represents a breaking API or functional change over a prerelease. untriaged New issue has not been triaged by the area owner
Projects
None yet
Development

No branches or pull requests

7 participants