Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementing C# null semantics in the new pipeline #15745

Merged
merged 1 commit into from
May 20, 2019
Merged

Conversation

maumar
Copy link
Contributor

@maumar maumar commented May 17, 2019

Currently we always apply "full" null semantics translation, meaning null values are completely removed from the tree. We could produce slightly simpler translation where it doesn't matter if we return NULL or FALSE (e.g. in predicates)
Also, currently when testing if a given subtree is null we do it in naive way, simply adding IsNull call around entire subtree. Instead we can test it's constituents, but to do it properly we need to persist nullability information.

@maumar maumar requested review from smitpatel and roji May 17, 2019 01:21
// !(false) -> true
if (unaryExpression.NodeType == ExpressionType.Not
&& unaryExpression.Operand is ConstantExpression innerConstant
&& innerConstant.Type == typeof(bool))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

innerConstant.Value is bool value
avoids the cast below

}

var newElseResult = (SqlExpression)Visit(caseExpression.ElseResult);
_isNullable |= isNullable;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Certain expressions like ElseResult could be null too. Is value of _isNullable correct in such cases? Are we not ending up with stale value if we are not resetting the value before each visit.

@maumar
Copy link
Contributor Author

maumar commented May 18, 2019

@smitpatel new version up

Expression.NotEqual(newArgument, _constantNullString),
methodCallExpression.Update(newObject, new[] { newArgument })));

if (newArgument is ConstantExpression)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: use ternary

Expression.NotEqual(newArgument, _constantNullString),
Expression.Not(innerMethodCall.Update(newObject, new[] { newArgument }))));

if (newArgument is ConstantExpression)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ternary

@@ -2,7 +2,7 @@
<Import Project="..\Directory.Build.props" />

<PropertyGroup>
<StandardTestTfms>netcoreapp3.0</StandardTestTfms>
<StandardTestTfms>net461;netcoreapp3.0</StandardTestTfms>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are deleting this file in Brice's PR. Should revert this change

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left it by accident


Assert.Equal(405, result.Count);
var result = query.ToList();
Assert.Equal(162, result.Count);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the correct result here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

debatable, it's for relational null semantics so we cant use L2O as baseline. 162 is assuming null == null doesn't match, 405 is where they match. Seems like 162 should be the expected result.

}
}

public static TResult Maybe<TResult>(object caller, Func<TResult> expression)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Protected

// !(false) -> true
if (outerUnary.OperatorType == ExpressionType.Not
&& outerUnary.Operand is SqlConstantExpression innerConstantBool
&& innerConstantBool.Type == typeof(bool))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Value is bool pattern match

// NULL IS NULL -> true
if (outerUnary.OperatorType == ExpressionType.Equal
&& outerUnary.Operand is SqlConstantExpression innerConstantNull1
&& innerConstantNull1.Value == null)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If value is not null then you can always return false. Likewise for block below. Basically, you can reduce constant IS NULL check to true/false by looking at the value.

if (innerUnary.OperatorType == ExpressionType.Equal)
{
//!(a IS NULL) -> a IS NOT NULL
return Visit(_sqlExpressionFactory.IsNotNull(innerUnary.Operand));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Visit only innerUnary.Operand and then create the expression

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we would potentially be missing some optimizations

//!(a IS NOT NULL) -> a IS NULL
if (innerUnary.OperatorType == ExpressionType.NotEqual)
{
return Visit(_sqlExpressionFactory.IsNull(innerUnary.Operand));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above & other blocks below too

// true || a -> true
// false && a -> false
// false || a -> a
if (newLeftConstant != null)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove if above and use else if


if (sqlBinaryExpression.OperatorType == ExpressionType.Equal)
{
if (!leftNullable && !rightNullable)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better represented by nested ifs with 1 condition at time

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally for me I think the current grouping is more readable

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The we need to be explicit about the flags in block 3 & 4. It implicitly assumes value of the other flag which is not straight forward to understand.

// 1 | 0 | 0 | 1 | 0 |
// 1 | 1 | 1 | 1 | 1 |
// 1 | N | N | 0 | 0 |
private SqlBinaryExpression ExpandNonNullableEqualNullable(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just invert the arguments in ExpandNullableEqualNonNullable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was thinking about it, but didn't want to switch the order of terms in generated sql. I guess its not a big deal tho - will DRY

Copy link
Member

@roji roji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick jetlagged review from the airport

switch (extensionExpression)
{
case SqlConstantExpression sqlConstantExpression:
return ProcessSqlConstantExpression(sqlConstantExpression);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just call these Visit* as elsewhere?


if (sqlBinaryExpression.OperatorType == ExpressionType.Equal)
{
if (!leftNullable && !rightNullable)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally for me I think the current grouping is more readable

{
private readonly ISqlExpressionFactory _sqlExpressionFactory;

private readonly Dictionary<ExpressionType, ExpressionType> _expressionTypesNegationMap
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C# 8 switch expression FTW (more terse, more efficient):

Suggested change
private readonly Dictionary<ExpressionType, ExpressionType> _expressionTypesNegationMap
private static ExpressionType Negate(ExpressionType expressionType)
=> expressionType switch {
ExpressionType.Equal => ExpressionType.NotEqual,
ExpressionType.NotEqual => ExpressionType.Equal,
ExpressionType.GreaterThan => ExpressionType.LessThanOrEqual,
ExpressionType.GreaterThanOrEqual => ExpressionType.LessThan,
ExpressionType.LessThan => ExpressionType.GreaterThanOrEqual,
ExpressionType.LessThanOrEqual => ExpressionType.GreaterThan
};

If you need to check whether negation is possible, we can have TryNegate.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which one would be more performant? Readonly dictionary should give O(1) access. Methodcall may need to check through all cases (rejected ones) before reaching correct one.

Copy link
Member

@roji roji May 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A switch over an enum is a switch over its underlying type - so over an int here, which in most cases will do an efficient lookup table under the hood. This is in general much faster than any dictionary - see benchmark below. Also no allocations whatsoever for the dictionary and its internals.

Note that the difference here specifically may be negligible in the grander scheme - especially since this is query compilation only - but a switch expression also seems more terse so why not.

And because it's so easy to do, here's a benchmark comparison:

// * Summary *

BenchmarkDotNet=v0.11.5, OS=ubuntu 19.04
Intel Core i7-6700HQ CPU 2.60GHz (Skylake), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=3.0.100-preview4-011223
[Host] : .NET Core 3.0.0-preview5-27626-15 (CoreCLR 4.6.27622.75, CoreFX 4.700.19.22408), 64bit RyuJIT
DefaultJob : .NET Core 3.0.0-preview5-27626-15 (CoreCLR 4.6.27622.75, CoreFX 4.700.19.22408), 64bit RyuJIT

Method Mean Error StdDev Median
SwitchExpression 2.553 ns 0.0884 ns 0.1150 ns 2.533 ns
Dictionary 7.055 ns 0.1993 ns 0.4775 ns 6.813 ns
public class SwitchEnum
{
    [Benchmark]
    public ExpressionType SwitchExpression()
        => SwitchMethod(ExpressionType.GreaterThanOrEqual);

    [Benchmark]
    public ExpressionType Dictionary()
        => _dictionary[ExpressionType.GreaterThanOrEqual];

    static readonly Dictionary<ExpressionType, ExpressionType> _dictionary
        = new Dictionary<ExpressionType, ExpressionType>
    {
        { ExpressionType.Equal, ExpressionType.NotEqual },
        { ExpressionType.NotEqual, ExpressionType.Equal },
        { ExpressionType.GreaterThan, ExpressionType.LessThanOrEqual },
        { ExpressionType.GreaterThanOrEqual, ExpressionType.LessThan },
        { ExpressionType.LessThan, ExpressionType.GreaterThanOrEqual },
        { ExpressionType.LessThanOrEqual, ExpressionType.GreaterThan },
    };

    static ExpressionType SwitchMethod(ExpressionType expressionType)
        => expressionType switch {
            ExpressionType.Equal => ExpressionType.NotEqual,
            ExpressionType.NotEqual => ExpressionType.Equal,
            ExpressionType.GreaterThan => ExpressionType.LessThanOrEqual,
            ExpressionType.GreaterThanOrEqual => ExpressionType.LessThan,
            ExpressionType.LessThan => ExpressionType.GreaterThanOrEqual,
            ExpressionType.LessThanOrEqual => ExpressionType.GreaterThan,
            _ => throw new ArgumentException(nameof(expressionType))
        };
    }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If switch is faster let's use it. Is it specific for Enum case only of any type? I would guess not all pattern matching is as cheap as enums over int.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are the results for string keys, switch wins as well:

Method Mean Error StdDev
SwitchExpression 13.29 ns 0.2945 ns 0.4317 ns
Dictionary 23.10 ns 0.4976 ns 1.0387 ns

In general, for small datasets switch should (almost) always be faster than a dictionary - when in doubt we can always do a quick benchmark to find out.

}
}

if (extensionExpression is SqlBinaryExpression outerBinary)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method already deals with a lot of cases, maybe split to one method for (outer) unary and a separate one for (outer) binary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the reason why I didnt split them is so I can easily fall back to base.VisitExtension call. But I guess I can just return null in the VisitUnary/VisitBinary for unoptimized cases and test for that in the VisitExtension

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a proposal, no strong feelings about it... :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think its a good suggestion, especially when/if we add more optimizations, the single method will become unreadable

{
public class NegationOptimizingVisitor : ExpressionVisitor
{
private readonly Dictionary<ExpressionType, ExpressionType> _expressionTypesNegationMap
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, C# 8 switch expression

@maumar
Copy link
Contributor Author

maumar commented May 20, 2019

new version up @smitpatel @roji

// however if we decide to do "partial" null semantics (that doesn't distinguish between NULL and FALSE, e.g. for predicates)
// we need to be extra careful here
var reverse = ReverseExpressionType(innerBinary.OperatorType);
if (reverse != innerBinary.OperatorType)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems a bit awkward. Any better way to replace ContainsKey() check? @roji

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thinking to revert it back to dictionary for now just to get the feature in - will refactor into switch in future checkin if we come up with a nice pattern for the ContainsKey thing

Copy link
Member

@roji roji May 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your force-push made some commits disappear, can't see this exact code.

In any case, I'd have a TryReverse() which returns bool (like Dictionary.TryGetValue()), upon which you could implement Reverse() (which throws if not found, like Dictionary.this[]).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(sorry, had a weird unrefreshed browser version where this PR wasn't merged yet)

return newArgument is ConstantExpression
? result
: Expression.OrElse(
Expression.Equal(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indent

Expression.Not(innerBinary.Right)));
}

var reverse = ReverseExpressionType(innerBinary.NodeType);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return null and use null check

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or move back to dictionary


In reply to: 285759777 [](ancestors = 285759777)

Copy link
Contributor Author

@maumar maumar May 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but then I need to use .Value everywhere - seems even more awkward

@@ -8,6 +8,7 @@
<RootNamespace>Microsoft.EntityFrameworkCore</RootNamespace>
<GenerateDocumentationFile>true</GenerateDocumentationFile>
<CodeAnalysisRuleSet>..\..\EFCore.ruleset</CodeAnalysisRuleSet>
<LangVersion>8.0</LangVersion>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was added by VS when i used c# 8.0 "switch" - will remove

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're OK with switch expressions, that's new to C# 8, as opposed to plain old verbose switch statements. Although for now I don't see that change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(sorry, had a weird unrefreshed browser version where this PR wasn't merged yet)

@@ -49,7 +49,7 @@ var results
Assert.Equal(
@"SELECT [m].[Int]
FROM [MappedNullableDataTypes] AS [m]
WHERE [m].[TimeSpanAsTime] = '00:01:02'",
WHERE ([m].[TimeSpanAsTime] = '00:01:02') AND [m].[TimeSpanAsTime] IS NOT NULL",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we improve this?

// non_nullablee_constant IS NOT NULL -> true
if (sqlUnaryExpression.OperatorType == ExpressionType.NotEqual
&& sqlUnaryExpression.Operand is SqlConstantExpression innerConstantNull2
&& innerConstantNull2.Value == null)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...

return sqlBinaryExpression.Update(newLeft, newRight);
}

return null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just do visit here rather than calling base in initial method?

Copy link
Contributor

@smitpatel smitpatel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

Currently we always apply "full" null semantics translation, meaning null values are completely removed from the tree. We could produce slightly simpler translation where it doesn't matter if we return NULL or FALSE (e.g. in predicates)
Also, currently when testing if a given subtree is null we do it in naive way, simply adding IsNull call around entire subtree. Instead we can test it's constituents, but to do it properly we need to persist nullability information.
@maumar maumar merged commit dbfa82a into master May 20, 2019
@ghost ghost deleted the null_semantix branch May 20, 2019 22:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants