Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Put generated files to disk: #47047

Merged
merged 12 commits into from
Sep 25, 2020
Merged

Conversation

chsienki
Copy link
Contributor

@chsienki chsienki commented Aug 21, 2020

  • Add new command line param 'GeneratedFilesOut'
  • Route new param through tasks with defaults
  • Write out generated files when requested to do so
  • Rationalize post generation to avoid uncesseary work
  • Add Tests for targets and emit

Fixes #47211

@chsienki chsienki added the PR For Personal Review Only The PR doesn’t require anyone other than the developer to review it. label Aug 21, 2020
@chsienki chsienki added Area-Compilers Feature - Source Generators Source Generators and removed PR For Personal Review Only The PR doesn’t require anyone other than the developer to review it. labels Aug 22, 2020
@chsienki chsienki marked this pull request as ready for review August 22, 2020 00:50
@chsienki chsienki requested a review from a team as a code owner August 22, 2020 00:50
@chsienki
Copy link
Contributor Author

@dotnet/roslyn-compiler for review please

@jmarolf
Copy link
Contributor

jmarolf commented Aug 22, 2020

:(

@chsienki
Copy link
Contributor Author

@jmarolf Why the sad face?

@jaredpar
Copy link
Member

@jmarolf if you have constructive feedback about this PR I'd be interested in hearing about it. A simple frowny face though doesn't really move the conversation forward, it just reduces the momentum of PRs and causes confusion.

This is a long planned feature for source generators backed by customer scenarios. Specifically customers want the ability to see the actual generated source on disk and the compiler is providing that ability.

@jmarolf
Copy link
Contributor

jmarolf commented Aug 25, 2020

Specifically customers want the ability to see the actual generated source on disk and the compiler is providing that ability.

Sorry, there isn't a link to a bug in this PR and the description isn't sufficient to understand where the customer scenario is coming form.

My sad face should be interpreted as "this is unfortunate but I am not going to block this PR if there are good reasons™ we need to do it"

I have seen almost all source generator authors try and mess with file IO. the fact that we do not generate files has led a lot of them to come to forums and ask why and we explain "your source generator should not be doing file IO". Our current design appears (anecdotally) to guide the user into the pit of success of allowing file operations to be tracked by MSBuild. Once this is checked in source generator authors will check for the existence of generated files, most likely do IO to read them, and then decide if they need to regenerate. This is the pattern most authors first try before running into a wall.

Without any details about why this is being done I can only speculate but if this is about either

  • being able to have source-based artifacts available in the debugger or some artifacts service
  • being able to provide better debug-ability for source generators

I would like us to take a different approach. Neither of those two cases screams that we need this functionality and I would prefer our current approach of not making file IO an easy thing for source generator authors to do.

@chsienki
Copy link
Contributor Author

@jmarolf There isn't an issue, but it was always part of the design to (optionally) persist generated files to disk: https://github.com/dotnet/roslyn/blob/master/docs/features/source-generators.md#output-files

Note that we don't use them for debugging, or build correctness (we consulted with the MSBuild team too while designing this). Yes it potentially opens up an avenue for authors to attempt to 'optimize' their generator by not generating any source, but there's nothing stopping them from doing that themselves anyway, so I don't really think we're opening up a pit of failure with this change.

Personally, I wish we'd never write anything at all (hence why the design is you can disable it, and we work entirely off on in-memory representations), but we heard from multiple customers during the design phase that they wanted this feature; we've deliberately waited to implement it later on to see if it was actually needed, and it's been one of the most vocal pieces of feedback we've heard from customers (both authors and consumers).
.
Note also that this is a pure compiler-level feature: there isn't an API we're exposing to do this, and there are no indications to the generator itself that it will happen; it's purely a debugging aid that customers can use to better understand what's happening in the compiler.

@jaredpar
Copy link
Member

@jmarolf

I have seen almost all source generator authors try and mess with file IO

I don't see how that is related to this change at all though. This is about the compilation process choosing to emit extra artifacts that we control to disk. Not morally different than the XML doc file, PDB, etc ... It doesn't have any relation to SG authors trying to read / write files in their code.

bool hasAnalyzerConfigs = !Arguments.AnalyzerConfigPaths.IsEmpty;
bool hasGeneratedOutputPath = !string.IsNullOrWhiteSpace(Arguments.GeneratedFilesDirectory);

var generatedSyntaxTrees = compilation.SyntaxTrees.Skip(Arguments.SourceFiles.Length).ToImmutableArray();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why ToImmutableArray vs. say ToList? This doesn't ever appear to be used as an ImmutableArray<T> hence seems like this is just adding some extra overhead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I needed the length; I am a cargo cult programmer so I used ImmutableArray because we use it everywhere else instead of lists.

More seriously, is there a perf preference between list and regular array? My intuition here is to use .ToArray() as we only need the length and to iterate over it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On framework ToList is generally faster than ToArray due to having one less allocation. On .NET Core it's somewhat the opposite because they use better builders (got schooled on that fix on twitter recently). Would probably go with ToList for now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Segmented arrays would be the best case scenario here. I've completed the implementation of SegmentedArray<T> but I'm still working through the details to get SegmentedList<T> in place.

@jmarolf
Copy link
Contributor

jmarolf commented Aug 25, 2020

I don't see how that is related to this change at all though

Its only related in that before source generators were "stateless". You could no know as a source generator if you had run previously. It was not possible with out pretty unusual hacks. Now it will be possible for source generator authors to pass this flag in and use previous build artifacts to know if they've run. It's a very minor point (thus the very minor :( face).

@@ -615,6 +616,17 @@ public new CSharpCommandLineArguments Parse(IEnumerable<string> args, string? ba
}
continue;

case "generatedfilesout":
Copy link
Member

@jcouv jcouv Aug 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The help string (for csc --help) should be updated to document new option. #Resolved

@@ -615,6 +616,17 @@ public new CSharpCommandLineArguments Parse(IEnumerable<string> args, string? ba
}
continue;

case "generatedfilesout":
if (RoslynString.IsNullOrWhiteSpace(value))
Copy link
Member

@jcouv jcouv Aug 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RoslynString [](start = 32, length = 12)

nit: I think string.IsNullOrWhiteSpace would do (RoslynString was there to add nullability annotations, but shouldn't be necessary now) #Resolved

case "generatedfilesout":
if (RoslynString.IsNullOrWhiteSpace(value))
{
AddDiagnostic(diagnostics, ErrorCode.ERR_SwitchNeedsString, MessageID.IDS_Text.Localize(), arg);
Copy link
Member

@jcouv jcouv Aug 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ERR_SwitchNeedsString [](start = 69, length = 21)

I couldn't find ERR_SwitchNeedsString in the added sections of the test files. Please add a test, if we don't have one. #Resolved

@@ -1,4 +1,5 @@
*REMOVED*Microsoft.CodeAnalysis.Compilation.CreateFunctionPointerTypeSymbol(Microsoft.CodeAnalysis.ITypeSymbol returnType, Microsoft.CodeAnalysis.RefKind returnRefKind, System.Collections.Immutable.ImmutableArray<Microsoft.CodeAnalysis.ITypeSymbol> parameterTypes, System.Collections.Immutable.ImmutableArray<Microsoft.CodeAnalysis.RefKind> parameterRefKinds) -> Microsoft.CodeAnalysis.IFunctionPointerTypeSymbol
Microsoft.CodeAnalysis.CommandLineArguments.GeneratedFilesDirectory.get -> string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GeneratedFilesDirectory [](start = 44, length = 23)

nit: there is a small inconsistency in naming between GeneratedFilesOutputPath and GeneratedFilesDirectory. Is that intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have OutputPath and OutputDirectory, so I just made it match the same pattern. No objection to unifying it though.


In reply to: 478710831 [](ancestors = 478710831)

}
}

embeddedTexts = embeddedTexts.AddRange(embeddedTextBuilder.ToImmutable());
Copy link
Member

@jcouv jcouv Aug 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ToImmutable [](start = 83, length = 11)

Do we need to materialize array here? I think the AddRange(IEnumerable<T>) overload should work. #Resolved

using var writer = new StreamWriter(fileStream);

sourceText.Write(writer, cancellationToken);
touchedFilesLogger?.AddWritten(path);
Copy link
Member

@jcouv jcouv Aug 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have test coverage for the touched files logic? #Resolved

var fileStream = OpenFile(path, diagnostics, FileMode.OpenOrCreate, FileAccess.Write, FileShare.ReadWrite | FileShare.Delete);
if (fileStream is object)
{
using var disposer = new NoThrowStreamDisposer(fileStream, path, diagnostics, MessageProvider);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

disposer [](start = 46, length = 8)

nit: "ignoredDisposer"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it's not really ignored, as it gets disposed?


In reply to: 478716972 [](ancestors = 478716972)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine. I was thrown off, trying to find where it was used. If we supported discards in usings, we'd use a discard ;-)

}
foreach (var tree in generatedSyntaxTrees)
{
Debug.Assert(!string.IsNullOrWhiteSpace(tree.FilePath));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: just curious, how do we know that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we create them ourselves in the GeneratorDriver. If they don't have a path then something weird is going on.


In reply to: 478717222 [](ancestors = 478717222)

Copy link
Member

@jcouv jcouv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done with review pass (iteration 2)

@CyrusNajmabadi
Copy link
Member

Now it will be possible for source generator authors to pass this flag in and use previous build artifacts to know if they've run. It's a very minor point (thus the very minor :( face).

That definitely concerns me. It will be abused. Could we emit to a random location (and provide a flag to have us report what that random location is)? That way the user can go and look at teh source. But they could never depend on it to store state from run to run?

Copy link
Member

@CyrusNajmabadi CyrusNajmabadi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leave a comment

@jaredpar
Copy link
Member

jaredpar commented Aug 28, 2020

That definitely concerns me. It will be abused. Could we emit to a random location (and provide a flag to have us report what that random location is)? That way the user can go and look at teh source. But they could never depend on it to store state from run to run

The specific design, as expressed by our customers, is to have an option to emit files to disk for inspection. This is agreed on by the people involved.

I am a bit surprised at all of the push back here. There are a variety of options that can be abused by generator authors if they wish. This is just another one of them. The abuse here is no different than any other unlisted IO in the build tasks. That violates the build completely and is possible today without any flag.

@CyrusNajmabadi

If your "request changes" review is because you disagree with the design then I'd ask that you remove that. This is the design, it's been this way from the start and there has been ample opporutunity to push back on it. At this point we are in the end game of the product cycle and we are attempting to deliver the existing design. We can have a separate convo about the design but we'd like to move forward with the existing design so we can deliver on our RC1 commitments.

If you have feedback on the implementation more than happy to hear it and react here.

@CyrusNajmabadi CyrusNajmabadi self-requested a review September 24, 2020 17:44
@RikkiGibson RikkiGibson self-assigned this Sep 24, 2020
Copy link
Contributor

@RikkiGibson RikkiGibson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM--had a couple comments about test code, but do not need to be addressed given the timeline.

@@ -2476,6 +2476,24 @@ private static void ValidateEmbeddedSources_Windows(Dictionary<string, string> e
}
}

private static void ValidateWrittenSources(Dictionary<string, Dictionary<string, string>> expectedFilesMap, Encoding encoding = null)
{
foreach ((var dirPath, var fileMap) in expectedFilesMap.ToArray())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what effect does ToArray have here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copies it, so we don't modify the collection as we're iterating it :)


// Clean up temp files
CleanupAllGeneratedFiles(src.Path);
Directory.Delete(dir.Path, true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: is the CleanupAllGeneratedFiles call redundant given this call? (and in subsequent usages)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, actually, we don't need it, i'll make a note to clean that up later on. Thanks

@chsienki chsienki changed the base branch from master to release/dev16.8 September 24, 2020 20:26
Copy link
Member

@jasonmalinowski jasonmalinowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some small nits, but no major concerns now that the encoding bug is fixed. And I did review both compiler and IDE side of things.

}
else
{
generatedFilesOutputDirectory = ParseGenericPathToFile(value, diagnostics, baseDirectory);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we pass the switch more than once, will that just ignore everyone but the last?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is the same behavior as /out: etc.

@@ -29,12 +29,12 @@
</trans-unit>
<trans-unit id="ERR_AddressOfMethodGroupInExpressionTree">
<source>'&amp;' on method groups cannot be used in expression trees</source>
<target state="new">'&amp;' on method groups cannot be used in expression trees</target>
<target state="translated">&amp; pro skupiny metod se nedá použít ve stromech výrazů.</target>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these changing like this in this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think this is a mis-rebase due to the merge I had from master. Good catch, will clean up with a new rebase.

var dir = instance.GetPropertyValue("CompilerGeneratedFilesOutputPath");

Assert.Equal("false", emit);
Assert.Equal("", dir);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

string.Empty in earlier tests but not here?

bool hasAnalyzerConfigs = !Arguments.AnalyzerConfigPaths.IsEmpty;
bool hasGeneratedOutputPath = !string.IsNullOrWhiteSpace(Arguments.GeneratedFilesOutputDirectory);

var generatedSyntaxTrees = compilation.SyntaxTrees.Skip(Arguments.SourceFiles.Length).ToList();
Copy link
Member

@jasonmalinowski jasonmalinowski Sep 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no surprises here about other things adding trees? Nothing funky on the VB side of things around My since I think that implicitly adds trees, but probably not to the arguments compilation's SyntaxTrees collection?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing funky for C#. VB and My trees is a good thought, but the generators wont get loaded in that case, so this code will never execute.

@@ -102,7 +102,7 @@ public void NavigateToSourceGeneratedFile(Project project, ISourceGenerator gene

// The file name we generate here is chosen to match the compiler's choice, so the debugger can recognize the files should match.
// This can only be changed if the compiler changes the algorithm as well.
var temporaryFilePath = Path.Combine(projectDirectory, $"{generatorType.Module.ModuleVersionId}_{generatorType.FullName}_{generatedSourceHintName}");
var temporaryFilePath = Path.Combine(projectDirectory, generatorType.Assembly.GetName().Name ?? string.Empty, generatorType.FullName, generatedSourceHintName);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would we ever have an assembly with no name? I presume that means the file pathing would get screwed up since we'll have one less file component, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually have no idea, and couldn't easily make a test that did. The file path will just be /[typename]/[hintName.cs] in those cases. So we might see some oddities in file mathching in the IDE, but it shouldn't crash and burn at least.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the concern here being the "assembly" name will then actually be the projectDirectory name and matching won't work at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. Basically if you somehow pass a generator from an assembly with a null name, you wont be able to look at its files in the IDE. (If someone legitimately has a use case, we can fix it later on).

@@ -61,7 +61,7 @@ private static Project AddEmptyProject(Solution solution)

Assert.NotSame(originalCompilation, newCompilation);
var generatedTree = Assert.Single(newCompilation.SyntaxTrees);
Assert.Equal($"{typeof(GenerateFileForEachAdditionalFileWithContentsCommented).Module.ModuleVersionId}_{typeof(GenerateFileForEachAdditionalFileWithContentsCommented).FullName}_Test.generated.cs", Path.GetFileName(generatedTree.FilePath));
Assert.Equal("Test.generated.cs", Path.GetFileName(generatedTree.FilePath));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove the Path.GetFileName here and have it assert the full name, since we should know the full path that's being applied?

if (hasGeneratedOutputPath)
{
var path = Path.Combine(Arguments.GeneratedFilesOutputDirectory, tree.FilePath);
if (Directory.Exists(Arguments.GeneratedFilesOutputDirectory))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this check? Directory.CreateDirectory() will create it for you...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will, but we want it to explicitly error if the directory you've given us doesn't exist. We're saying 'we'll create the subdirectories, but not the main one'.

Now, the msbuild logic will create the output directory for you if you create it, but we don't really want to be in the business of creating arbitrary directory structure inside the compiler.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but we don't really want to be in the business of creating arbitrary directory structure inside the compiler.

I guess it seemed odd to actively not be in the business, since you're still creating all the other directories.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but we're creating a known set of sub-directories, not some arbitrary random path. I'm on the fence tbh, but we have tests covering the behavior so for now, #WontFix

Comment on lines +295 to +297
<MakeDir Condition="'$(CompilerGeneratedFilesOutputPath)' != ''"
Directories="$(CompilerGeneratedFilesOutputPath)" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this run during a design time build? It's probably not the worst thing to be doing as there's far greater sins...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disabled it for design time builds.

CompilerGeneratedFilesOutputPath controls the location the files will be output to.
The compiler will not emit any generated files when the path is empty, and defaults to a /generated directory in $(IntermediateOutputPath) if $(IntermediateOutputPath) has a value.
EmitCompilerGeneratedFiles allows the user to control if anything is emitted by clearing the property when not true.
When CompilerGeneratedFilesOutputPath is set we ensure the directory exists, and issue a warning if EmitCompilerGeneratedFiles is true, but no CompilerGeneratedFilesOutputPath is set
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd break this "When..." sentence up into two sentences, one with the set and unset case. Otherwise it's confusing whether the "and issue a warning if true" applies to which conditional. (This read as "If A, we'll do B when A isn't true.")

Comment on lines +292 to +294
<Warning Condition="'$(CompilerGeneratedFilesOutputPath)' == ''"
Text="EmitCompilerGeneratedFiles was true, but no CompilerGeneratedFilesOutputPath was provided. CompilerGeneratedFilesOutputPath must be set in order to emit generated files." />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this could only really happen if line 285 failed to run, which is only if you don't have an intermediate output path to start with I guess?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(seems unlikely, but sure. 😄)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it covers the case that you don't have an intermediate directory set, and you didn't explicitly provide a generated files output.

- Add new command line param 'GeneratedFilesOut'
- Route new param through tasks with defaults
- Write out generated files when requested to do so
- Rationalize post generation to avoid uncesseary work
- Add Tests for targets and emit
- Rename MSBuild properties
- Add test to track the fact that we can overwrite files on windows in SxS cases
@chsienki chsienki merged commit de348c5 into dotnet:release/dev16.8 Sep 25, 2020
jasonmalinowski added a commit to jasonmalinowski/roslyn that referenced this pull request Oct 6, 2020
dotnet#47047 changed the logic for the
file name and path we generate, but we were no longer ensuring the
temporary directory existed prior to trying to create the file
within it.
kzu added a commit to devlooped/ThisAssembly that referenced this pull request Dec 11, 2020
This no longer needed since intellisense works just fine nowadays.

In addition, if persisting the sources for inspection is required, the following two MSBuild properties can be used to persist them: `EmitCompilerGeneratedFiles` and (optionally) `CompilerGeneratedFilesOutputPath`. If no custom path is provided, files will be persisted to the `$(IntermediateOutputPath)/generated/[GeneratorAssembly]/[GeneratorTypeFullName]` folder.

See also dotnet/roslyn#47047.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Change the file name that source generators write into the PDB
8 participants