Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IOperation representational issues when raising IL #8640

Open
nguerrera opened this issue Feb 12, 2016 · 2 comments
Open

IOperation representational issues when raising IL #8640

nguerrera opened this issue Feb 12, 2016 · 2 comments
Labels
Area-Analyzers Concept-API This issue involves adding, removing, clarification, or modification of an API. Feature - IOperation IOperation
Milestone

Comments

@nguerrera
Copy link
Contributor

This is a list of the representational issues that I've encountered with raising IL method bodies to IOperation trees.

I believe that we can add expressiveness to IOperation after the initial release, but I'd like some eyes on this list to help validate that there isn't anything that would require a breaking design-change before v1.

See microsoft/binskim#10 for the (still prototype quality) IL -> IOperation code that triggered these obervations. In particular, ILOperations.Custom.cs shows the nodes that I've invented to cover the gaps below.

cc @JohnHamby @genlu @jaredpar

Operations that are possible in C# but not modeled by IOperation

Bugs are filed for these because there should not (even in v1) be valid C# syntax that cannot be surfaced as IOperation.

Nodes for these would be useful for corresponding IL instructions.

By-ref handling

In IL, managed pointers (type &) are basically just like unmanaged pointers except they are reported to the GC. Passing something by reference is explicitly done by taking its address as a managed pointer and passing that. This is distinct from the IOperation model where the by-ref semantics are captured by how IReferenceExpressions are used (e.g. arguments to IParameterSymbol with IsByRef == true or as a value type IInvocationExpresion.Instance).

For the common cases, we can (and do) raise the IL up to the IOperation model, but there are a few cases that are problematic:

  1. By-ref locals and returns

    This should be solved once Ref returns and Locals #8030 lands. There will be a few amendments to IOperation needed to accommodate it. Discussion on that is in progress on the PR.

    Note that this is not just an issue for when the actual IL has byref locals and returns, but also for when we need to put managed pointer stack values in to generated temporaries to deal with branching with a non-empty stack.

  2. By-ref pointer arithmetic and comparison

    Right now, I've taken the approach of adding explicit conversions to unmanaged pointer types for these cases, but it incorrectly looks as though GC tracking would stop.

Exception handling

Fault blocks

A fault block is effectively a finally block that only runs when an exception is raised. There is no direct C# or VB equivalent.

This is close:

try { 
    ... 
} catch {
    ...
    throw;
}

but gets the semantics as to when the exception is considered unhandled wrong.

This is closer:

    bool succeeded = false;
    try {
        ...
        succeeded = true;
    } finally {
        if (!succeeded) {
            ...
        }
    }

but a succeeded = true would need to be injected at every branch out (leave instruction) in the try block, which would be painful to implement and difficult for analyzers to reason about.

Given that, it would be best IMHO to just model fault blocks directly in IOperation. Adding IBlockStatement ITryStatement.FaultHandler would work. It adds a slight wrinkle of having to define an ordering between fault and finally if both were present on the same try node, but I think it's perfectly intuitive to say that finally is always last. (Nested tries can be used to force a different order.)

Filters

EDIT: This section was written before seeing PR #8624, which changed the filter to an IOperation and blurred expression/statement boundary). I have tentatively represented this as a BlockStatement node with a non-null Type and the implied semantics that the type of the final operation in the block is its value in an expression context. I am still discussing on #8624 to understand if this is a suitable direction in general (reusing XxxStatement nodes as expressions).

ICatch has IExpression Filter, which is sufficient for C# or VB's usage, but an arbitrary filter in IL cannot necessarily be expressed as a single IExpression.

We could resolve this with a new node of the form:

    interface IBlockExpression : IExpression {
        IBlockStatement Block { get; }
        IExpression Expression { get; }
    }

The semantics would be to evaluate Block, then Expression for its result.

Alternatively, we could move the filter body into a lambda and invoke it from the filter:

    Func<Exception, bool> filter = ex => { 
        ...; 
    }
    try {
       ...
    } catch (Exception ex) when(filter(ex)) {
       ...
    }

The latter would capture the semantics and decompile to C#, but convey an expense (closure + delegate allocation, invocation) that does not exist in the IL.

In general, we have to decide if we want to make these sorts of transformations that lose fidelity to the IL. I'd prefer that we didn't. I can see use cases evolving (decompilation) where we'd want to raise the tree up to something C# can produce to the full extent possible, but I think those should be handled with separate, optional passes.

Opcodes that have no direct C#/VB/IOperation equivalents

I've noted possible ways to raise them to higher level constructs, but again I would prefer that there were explicit nodes for them.

"unordered" floating point comparison

There are no BinaryOperationKinds to express the different handling of floating point comparisons vs. NaN values when e.g. cgt.un is used in place of cgt to compare floating point values.

Could possibly add explicit NaN checks instead.

unbox instruction

Unlike the unbox.any variant, unbox puts a by-ref to the value on
the heap on to the stack. It is therefore not just a conversion from
reference type to value type.

Adding interface IUnboxReferenceExpression : IReferenceExpression could
resolve this.

ldftn/ldvirtftn instructions

Although they can be pattern-matched to delegate creation in the common case, we can't represent arbitrary use with the current set of nodes.

calli instruction

I don't see an alternative to having an explict node.

jmp instruction

Could possibly raise as an explicit tail call (but see tail.) below.

cpblk instructions

Could possibly raise as naive loop or Buffer.BlockCopy call.

ckfinite instruction

Could possibly raise as explicit:

    if (Double.IsNan(x) || Double.IsInfinity(x)) 
        throw new ArithmeticException());

initblk instruction

Could possibly raise as naive loop.

break instruction

Could possibly raise as call to Debugger.Break.

ldtoken instruction

For types, We can pattern match to typeof(X) and fall back to typeof(X).TypeHandle, but there is no way to represent loading field and method tokens.

volatile. prefix

Could possibly raise as Volatile.Read/Write calls.

unaligned. prefix

Might be OK to ignore.

tail. prefix

If not represented, analyzers would be given a false impression of stack growth.

no.{typecheck|rangecheck|nullcheck} prefix

Probably OK to ignore. It is only well-defined to use these when the skipped check could not have failed. A CLI implementation is allowed to ignore them.

readonly. prefix

Probably OK to ignore. It would just put limitations on the subsequent use that would not be infringed upon by valid IL.

@ManishJayaswal
Copy link
Contributor

@JohnHamby

@nguerrera
Copy link
Contributor Author

For byref arithmetic and comparison, we can consider raising calls to https://github.com/dotnet/corefx/issues/10451

@jinujoseph jinujoseph modified the milestone: 15.3 Mar 10, 2017
@jinujoseph jinujoseph modified the milestones: Unknown, 15.3 Apr 19, 2017
@jinujoseph jinujoseph added Concept-API This issue involves adding, removing, clarification, or modification of an API. and removed Discussion labels Apr 19, 2017
@jinujoseph jinujoseph modified the milestones: Unknown, 16.0 Jun 23, 2018
@jinujoseph jinujoseph modified the milestones: 16.0, 16.1 Jan 16, 2019
@jinujoseph jinujoseph modified the milestones: 16.1, Backlog Feb 25, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Analyzers Concept-API This issue involves adding, removing, clarification, or modification of an API. Feature - IOperation IOperation
Projects
None yet
Development

No branches or pull requests

5 participants