Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Source Generator Safe Editions(Transformer) #3772

Closed
Jose10go opened this issue Jul 4, 2020 · 14 comments
Closed

Source Generator Safe Editions(Transformer) #3772

Jose10go opened this issue Jul 4, 2020 · 14 comments

Comments

@Jose10go
Copy link

Jose10go commented Jul 4, 2020

The following proposal generalize the concept of source generators that is being implemented to allow safe editions on source code. It enables some scenarios that where posible on the original proposal (the ones with replace and original keywords in the language) without adding the implementation complexity of it.

The current design of source generators is additive only and it may receive what the team call IDE Integration, understood as design time generation, the ones that doesn't receive this "IDE Integration" only runs at build time.

This proposal is about providing an extension point around the following interface to enable additive only mutations.

public interface ITransformer<T> 
    where T:Attribute
{
    BlockSyntax Transform(MethodInformation method,AttributeInformation att);
}

Why allow editions?

AOP and meta-programming (for more info on why allow editions read this blog
from Gael Fraiteur, the creator of PostSharp).

Issues related to Editing

Allow editing source code comes with a lot of headaches, here some of them:

  1. If multiple generators edit the same file, in what order they must run???, what input get each one???
  2. Performance on IDE may be poor since the current proposal states that generators run unordered.
  3. How to offer high fidelity Intellisense support for the modified Types.
  4. (The great one) Allow editions may be a great source for anti-patters and might create a lot of c# dialects destroying the "what you see is what you get" principle.

Solutions

Allow only non-API changes non-API changes mean disallow any changes on definitions. Having this restriction is imposible to modify a method signature, or even a type definition.

Only changes to code blocks are allowed, and in C# this means methods and properties' body. (For simplification only methods' body transformers are explained here)

With non-API changes the interface of a type won't change and that also means that the Intellisense support for that type remains invariant after the edition.
(This solves problem 3).

Issues 1 and 2 are focused on performance to allow a great experience on the IDE. An important thing to notice is that there is not need to run the edit generators in design time thanks to the non-API changes rule and that is, in fact, very good for the IDE experience.

Implementation

Each transformer must use an attribute to mark which method body is transformed. So multiple editions on a method are seen as multiple attributes on the method definition.

Composition order is implicitly defined by the attributes:

[Timer]
[Trace]
public void fibonacci(int n)=>
    n<2?1:fibonacci(n-1)+fibonacci(n-2);

The compose order in the example is first apply Trace and then Timer.

Lets define TraceTransformer for the explanation:

//The attribute needed to use the transformer
public class TraceAttribute: Attribute{}
public class TraceTransformer: ITransformer<TraceAttribute>
{
    public BlockSyntax Transform(MethodInformation original, AttributeInformation attinfo)
    {
        return SyntaxFactory.ParseBlock($@"{{
            Console.WriteLine(""{original.Name}""+"" started..."");
            var result=original({original.Arguments.First().Name});
            Console.WriteLine({original}+"" ended..."");
        }}");    
    }
}

And also TimerTransformer:

//The attribute needed to use the transformer
public class TimerAttribute: Attribute{}
public class TimerTransformer: ITransformer<TimerAttribute>
{
    public BlockSyntax Transform(MethodInformation method,AttributeInformation att)
    {
        return SyntaxFactory.ParseBlock($@"{{
            var crono = new Stopwatch();
            crono.Start();
            var result=original({method.Arguments.First().Name});
            crono.Stop();
            Console.WriteLine({method.Name}+"" runs in ""+crono.ElapsedMilis+"" milliseconds"");
        }}");    
    }
}

The result of the composition operation on the previously defined fibonacci function is:

int fibonacci(int n)
{
    int original(int n)
    {
        int original(int n) => 
            n<2?1:fibonacci(n-1)+fibonacci(n-2);

        {
            Console.WriteLine("fibonacci"+" started...");
            var result = original(n);
            Console.WriteLine("fibonacci"+" ended...");
            return result;
        }
    }
    {
        Stopwatch crono = new Stopwatch();
        crono.Start();
        var result = original(n);
        crono.Stop();
        Console.WriteLine("fibonacci"+" runs in "+crono.ElapsedMilis+" milliseconds.");
        return result;
    }
}

The composition operation consists on defining a local function with the same signature that the method been transformed and with the body of the
previous transformer or in the base case the body of the method been transformed, but with name "original".

Important things to notice:

  1. The composition is not defined in the transformer, so there is not user-code intervention here.
  2. The non-API changes rule allows all the local functions to have the same signature that the method been transformed without issues.
  3. All the nested local functions are named: 'original'. In this way, each transformer only has access to the output of the previous transformer and to the method been transformed.
  4. Since all the nested functions have the same signature, all parameters have the same names and there for only the inner-most are accessible to each function body.
  5. The transformer code is always inserted inside a block after the 'original' nested function, this allows that any user defined variable doesn't exist inside the corresponding original scope. (Is not captured by the closure)
  6. Each transformer gets the same methodInformation instance as input, so they can run unordered or even in parallel, the order is only needed to apply composition.
  7. The composition operator is a constant operation since it only adds the output of the previous transformer as a local definition.

Conclusions

This proposal may comply with most of common arguments against allowing source code editing at build time (source code weaving) in C#. Also it uses previous C# features, without need to add new keywords or constructs. It fits or complements C# 9's proposal additive source generators.

Current proposal no complies with issue 4. Maybe because, there is not answer to solve it... it doesn't allow any transformation that breaks the principle of "what you see is what you get" (even when C# compiler does some magic in several scenarios), then this current proposal of source generators isn't allowed under that rule.

Also, it's important to remark that allowing transformers doesn't change the semantic of the language because it's true that it mutates source code, but it is an additive only mutation, so any error that exists before the transformation process exists also after it runs.

@sighoya
Copy link

sighoya commented Jul 4, 2020

I think this is gonna be a good idea, because invariants still hold after expansion.

A common problem is to insert additional data into a struct for accessing runtime information, but this leads to a breakage of invariants for a struct.
So what about creating extension methods instead and retrieve extension data via some form of compiler reflection?
Is this covered in your proposal, too?

@HaloFour
Copy link
Contributor

HaloFour commented Jul 4, 2020

Seems like an interesting proposal. Using local functions is a novel approach as it seems to eliminate the need for language changes, but I think it introduces additional issues. For example, the scope of the aspects would leak into the original function. I think that can be resolved by wrapping the body of the aspect in its own code block, e.g.:

int fibonacci(int n)
{
    int original(int n)
    {
        int original(int n) => 
            n<2?1:fibonacci(n-1)+fibonacci(n-2);

        {
            Console.WriteLine("fibonacci"+" started...");
            var result = original(n);
            Console.WriteLine("fibonacci"+" ended...");
            return result;
        }
    }
    {
        Stopwatch crono = new Stopwatch();
        crono.Start();
        var result = original(n);
        crono.Stop();
        Console.WriteLine("fibonacci"+" runs in "+crono.ElapsedMilis+" milliseconds.");
        return result;
    }
}

There's also the issue that if the code is injected into the same source file that you have no control over the imported namespaces so you'd have to qualify every type/function used.

Then there's the tooling and debugging experience. The compiler needs to be able to associate the original code back to the lines in the source file. Plus there should be a way to see the decorated code and step into it, which this expansion doesn't really lend itself to.

Then there's the issue of ordering. Running them in the order of the attributes applied to the method definitely makes sense but I don't think that's how the compiler invokes analyzers or generators today. I think there's some design work that would need to be done to track the transformers within each generator applied to the solution so that they could be executed in any order and repeatedly.

I guess this precludes aspects applied to classes?

@sighoya
Copy link

sighoya commented Jul 4, 2020

@HaloFour

Oh Sh?t, you are right. The order is a big problem here.

I think this is only partially solvable because of the non deterministic order of libraries linked to this application.

Maybe, general code insertions should be handled by hygienic macros only and outside code generation only by annotations.

@Jose10go
Copy link
Author

Jose10go commented Jul 9, 2020

The scope of the aspects would leak into the original function.

If with leaked you mean captured by the closure, that's not the case because the scope of the aspect is inserted after the original function.(It's the 5th thing to note in the proposal...)

There's also the issue that if the code is injected into the same source file that you have no control over the imported namespaces so you'd have to qualify every type/function used.

I think this is a minor drawback ... but thanks for point it

Then there's the tooling and debugging experience. The compiler needs to be able to associate the original code back to the lines in the source file. Plus there should be a way to see the decorated code and step into it, which this expansion doesn't really lend itself to.

This is solvable and it seems to me like an issue to solve when the proposal is in a more advanced state (remember this is just a design)

Then there's the issue of ordering. Running them in the order of the attributes applied to the method definitely makes sense but I don't think that's how the compiler invokes analyzers or generators today. I think there's some design work that would need to be done to track the transformers within each generator applied to the solution so that they could be executed in any order and repeatedly.

You should read the proposal again, transformers runs unordered and all receive the same input, the order is just used in the compose operation....

@HaloFour
Copy link
Contributor

HaloFour commented Jul 9, 2020

@Jose10go

If with leaked you mean captured by the closure, that's not the case because the scope of the aspect is inserted after the original function.(It's the 5th thing to note in the proposal...)

The locals are still captured, they just can't be used because they're not yet assigned, except for local functions which would be invocable due to hoisting. Adding a block around the aspect neatly hides all of that away.

I think this is a minor drawback ... but thanks for point it

I think it's a pretty big deal. The first thing that every generator needs to do is negotiate the source file of the method it happens to be written in to determine what names are in scope and what aren't. This is a problem neatly avoided by original/replace which relied on partial types declared in completely independent source files.

This is solvable and it seems to me like an issue to solve when the proposal is in a more advanced state (remember this is just a design)

I'll be interested to see how you intend to tackle that in a prototype. Tooling is a big consideration. This is also a case where I think the separation of original/replace makes for a cleaner tooling experience, there's a clear and clean separation between the aspect and the original code.

transformers runs unordered and all receive the same input, the order is just used in the compose operation....

This is what I'm interested in because, to my understanding, this is very different to how analyzers/generators work now. Rhetorically, is it easier to try to weave the execution of the generators vs. trying to weave the generated code?

@Jose10go
Copy link
Author

Jose10go commented Jul 9, 2020

The locals are still captured, they just can't be used because they're not yet assigned, except for local functions which would be invocable due to hoisting. Adding a block around the aspect neatly hides all of that away.

You are absolutely right about this, i didn't catch it the first time you point it. 👍👍👍 .I am going to modify the proposal to
enclose the aspect code inside a block.

I think this is a minor drawback ... but thanks for point it

I think it's a pretty big deal. The first thing that every generator needs to do is negotiate the source file of the method it happens to be written in to determine what names are in scope and what aren't.

At least how i think about transformers (maybe it is not clear in the proposal) they work at syntactic level, so binding or any other semantic information is out. When a user writes a transformer, he can only depend on the original function the composition operation is going to put in scope. The transformer has to be agnostic about anything else because anything is guaranteed to be defined in the scope.

This is solvable and it seems to me like an issue to solve when the proposal is in a more advanced state (remember this is just a design)

I'll be interested to see how you intend to tackle that in a prototype. Tooling is a big consideration. This is also a case where I think the separation of original/replace makes for a cleaner tooling experience, there's a clear and clean separation between the aspect and the original code.

I have not a plan about it yet, so any idea is welcome, but the source generators being implemented for c# 9 may receive or not IDE support depending on the methods the generator implements(So i think not allowing IDE support for debugging isn't out the table).
Maybe read only generated in memory files to see transformations and add breakpoints is a good idea.

@sighoya
Copy link

sighoya commented Jul 9, 2020

At least how i think about transformers (maybe it is not clear in the proposal) they work at syntactic level, so binding or any other semantic information is out. When a user writes a transformer, he can only depend on the original function the composition operation is going to put in scope. The transformer has to be agnostic about anything else because anything is guaranteed to be defined in the scope.

What is, if the transformer is written in a static library outside your project. The user may can expand the insertions in an ordered way.
But then, the library is dynamically linked into your project, pasting some different content inside the annotee causing a change in semantic.

Further, if the transformer expansion in the static library was safe in execution but the transformer expansion of the dynamic library throws an exception, then the user has hard time to investigate the reason of an early return when he didn't see what was generated.

This is the same problem with Java's runtime reflection causing nondeterministic jumps in code when debugging over that code.

@aacombarro89
Copy link

aacombarro89 commented Jul 10, 2020

Hey guys thanks for the feedback on the proposal. We wanted exactly that, some discussions about things we may not be aware of. From my reading, in the proposal we presented, we still need to think on:

  1. How the aspect code should be expanded with the proper dependencies without causing any issues in the original file
  2. Tooling/Debugging experience (as general and unexplored territory for this proposal)

We'll iterate over this topics or additional you present to us. Please feel free to continue sharing ideas/concerns. Maybe we'll start by having a deep look into the original/replace proposal.

@Jose10go Jose10go reopened this Jul 25, 2020
@gafter gafter transferred this issue from dotnet/roslyn Aug 4, 2020
@aka-nse
Copy link

aka-nse commented Aug 17, 2020

Please confirm some points.

  • In my understand, .Net specifies nothing for order of attributes. For this feature, how the compiler recognize the order? If it is covered only in C#, there is no information about order in output binary?

  • How does it operate for inherited method: especially same generator attribute with base method is given?

@Jose10go
Copy link
Author

In my understand, .Net specifies nothing for order of attributes. For this feature, how the compiler recognize the order? If it is covered only in C#, there is no information about order in output binary?

Remember attributes are just metadata, these are interpreted by other programs (like the .NET Runtime, the compiler, the IDE....). In this case these attributes are interpreted by a transformer in the compiler pipeline and processed in the given order. So it doesn't matter the order in the output assembly because the runtime is not going to use them.

How does it operate for inherited method: especially same generator attribute with base method is given?

There is nothing different in this case, the transformation to the base method is handled when it is compiled and there is not any special handling at runtime. The code produced in the output assembly already contains the generated code so it is inherited like any other method.

@jfontsaballs
Copy link

jfontsaballs commented Nov 14, 2020

I would like to suggest another way to perform the transformation which I believe would solve the previously stated problems with debugging and (possibly) namespaces.

Original file does not change:

public class MyClass
{
    [Timer]
    [Trace]
    public int fibonacci(int n) => n<2?1:fibonacci(n-1)+fibonacci(n-2);
}

Transformers could stay the same:

//The attribute needed to use the transformer
public class TraceAttribute: Attribute{}
public class TraceTransformer: ITransformer<TraceAttribute>
{
    public BlockSyntax Transform(MethodInformation original, AttributeInformation attinfo)
    {
        return SyntaxFactory.ParseBlock($@"{{
            Console.WriteLine(""{original.Name}""+"" started..."");
            var result=original({original.Arguments.First().Name});
            Console.WriteLine({original}+"" ended..."");
        }}");    
    }
}
//The attribute needed to use the transformer
public class TimerAttribute: Attribute{}
public class TimerTransformer: ITransformer<TimerAttribute>
{
    public BlockSyntax Transform(MethodInformation method,AttributeInformation att)
    {
        return SyntaxFactory.ParseBlock($@"{{
            var crono = new Stopwatch();
            crono.Start();
            var result=original({method.Arguments.First().Name});
            crono.Stop();
            Console.WriteLine({method.Name}+"" runs in ""+crono.ElapsedMilis+"" milliseconds"");
        }}");    
    }
}

The original file is modified to call functions that contain the generated code, instead of inserting the generated code directly into the original file. Those functions are passed the arguments of the original function, a reference to this and a delegate to call its original implementation (see code examples). Those functions' implementations go to new generated source files, which allows the existing tooling for source generation to be used. Thus, there's no need to debug the modified code on the original file and using statements in the original file do not affect the generated code.

The original file after being modifies by the compiler:

public class MyClass
{
    int fibonacci(int n)
    {
        int original(int n)
        {
            int original(int n) => n<2?1:fibonacci(n-1)+fibonacci(n-2);
            TimerTransformerNamespace.TimerTransformerGeneratedClass.TimerTransformerGeneratedMethod(this, n, original);
        }
        TraceTransformerNamespace.TraceTransformerGeneratedClass.TraceTransformerGeneratedMethod(this, n, original);
    }
}

A new file is generated for each transformation, similarly to what is done when a partial class is added new methods using source generation:

namespace TraceTransformerNamespace
{
    public static class TraceTransformerGeneratedClass 
    {
        public static int TraceTransformerGeneratedMethod(MyClass @this, int n, Func<int, int> original)
        {
            Console.WriteLine("fibonacci"+" started...");
            var result = original(n);
            Console.WriteLine("fibonacci"+" ended...");
            return result;
        }
    }
}
namespace TimerTransformerNamespace
{
    public static class TimerTransformerGeneratedClass 
    {
        public static int TimerTransformerGeneratedMethod(MyClass @this, int n, Func<int, int> original)
        {
            Stopwatch crono = new Stopwatch();
            crono.Start();
            var result = original(n);
            crono.Stop();
            Console.WriteLine("fibonacci"+" runs in "+crono.ElapsedMilis+" milliseconds.");
            return result;
        }
    }
}

Maybe, the generation method could be made to return more than one BlockSyntax to allow generated code to be placed at the top of the file (for using statements) and inside of the static class but outside of the static method, although this is not strictly necessary but only a convenience to the writers of the source generator.

@jfontsaballs
Copy link

jfontsaballs commented Nov 14, 2020

Being able to edit methods and properties' implementations, would reduce the need for partial properties, as proposed in #3412. However, it would be very important that auto properties' implementation can be edited as well (typical example: add OnPropertyChanged call to properties).

@jfontsaballs
Copy link

Another option would be to generate the methods proposed in this comment inside the same class and a different file, having that the class marked partial. This would allow the generated code to access private fields inside the same class, which would not be possible if the methods are placed in a different class.

@etkramer
Copy link

What's the status on this?

@jnm2 jnm2 converted this issue into discussion #6266 Jul 11, 2022

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants