Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Allow compilation to in memory stream, part 1: Fsc output #715

Closed
wants to merge 57 commits into from

Conversation

enricosada
Copy link
Contributor

this pr is the first of a multuple set to allow in memory compilation and output

see fsharp/fsharp-compiler-docs#266
user voice: https://fslang.uservoice.com/forums/245727-f-language/suggestions/6717137-allow-for-completely-in-memory-compilation

the refactoring plan is to start with the lower level, convert paths to stream factories and bubble up these arguments up to fsc main.
There we can build a better api for in memory compilation ( really nice for testing or compiler service i think)

current fsc main:

    main0(argv,bannerAlreadyPrinted,exiter, errorLoggerProvider, disposables)
    |> main1
    |> main2
    |> main2b
    |> main2c
    |> main3 
    |> main4

inside main0 the command line argument are parsed and outfile/pdbfile path is set and passed to all other funcions ( main1 to main4 )

this pr add output/pdb/mdb to stream or file.
The output of assembly directly to memorystream works, pdb/mdb/signed require disk io with temp file.

the api to specify where send out/pdb/mdb is

type EmitTo =
    | EmittedFile of string
    | EmittedStream of Stream

with error and deferred creation

type EmitStreamProvider = Lazy<Choice<EmitTo, Diagnostic>>
type Diagnostic = string
  1. Choice<EmitTo, Diagnostic>:
    • to use success/fail pattern instead of exception based control flow
    • we can refactor (later) the control flow as computation expression
  2. Lazy:
    • deferred creation
    • once
    • we can ask ( Lazy.IsValueCreated ) if the factory is called, and easy cleanup the files on error
  3. both EmittedFile of string and EmittedStream of Stream , not only Stream
    • performance, useless continue to copy stream (build+sign, pdb) if i need a real file
    • current implementation (pdb/sign) require file on disk, useless hide that complexity. if we later improve pdb/sign, this api is ok
    • easy to consume api

As side note, roslyn do pretty much the same with EmitStreamProvider, so we can wrap later when a bridge is ready ( working on a partial bridge 😄 )

i need a feedback about naming, i think is weak (for example EmittedFile -> EmitToFile? dunno)

@dsyme
Copy link
Contributor

dsyme commented Nov 7, 2015

@enricosada To clarify - this looks like an FCS feature, but you'd like the core implementation diff that enables the feature to be integrated into Microsoft/visualfsharp to keep maximal code alignment? If so I am assuming that you are confident that the feature is non-intrusive and non-risky. Also, just to check, do you think it will commute well with the CoreCLR work?

@KevinRansom If so I'm ok with the approach of commiting the core of FCS implementation feature work to Microsoft/visualfsharp first, then integrating through to FCS. If you like that approach, then we should probably bring over the core implementation diff of a couple of other FCS features like in-memory cross-project references too. That will continue to put us in a really good position to fully align in the future (e.g. having the ability to build compatible FSharp.Compiler.Service DLLs out of Microsoft/visualfsharp).

@enricosada If @KevinRansom doesn't feel able to pull in the next couple of days for stability reasons, you can submit first to FCS. Make sure the diff is absolutely minimal.

Thanks
Don

@@ -222,7 +222,6 @@ type PdbData =
//---------------------------------------------------------------------

let WritePdbInfo fixupOverlappingSequencePoints showTimes f fpdb info =
(try FileSystem.FileDelete fpdb with _ -> ())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's this removal for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed because useless, the file is now removed in serializeToFile (the same for the .mdb)
I'll remove the delete inside serializeToFile and leave that one, less diff

@dsyme
Copy link
Contributor

dsyme commented Nov 7, 2015

Added a bunch of comments - thanks! It's great to have this underway - FCS can really benefit from this feature.

@enricosada
Copy link
Contributor Author

@dsyme really thx for feedback, i'll fix issues.

I wrote this internal refactoring as less intrusive as possibile, with minimal diff in mind.
I sent the pr there because i'd like a smooth integration with visualfsharp, instead of the difficult merge of multiple commit later when the whole in-memory compilation is ready in fcs.
I really like to minimize merge efforts, it'is wasted time.

I think after this one is merged in visualfsharp, fcs can pull and we can deliver the in-memory compilation without waiting visualfsharp if needed (i'd like to help too).

@@ -1840,30 +1861,38 @@ let main1(tcGlobals, tcImports: TcImports, frameworkTcImports, generatedCcu, typ
use unwindBuildPhase = PushThreadBuildPhaseUntilUnwind (BuildPhase.Output)
if tcConfig.printSignature then InterfaceFileWriter.WriteInterfaceFile (tcGlobals,tcConfig, InfoReader(tcGlobals,tcImports.GetImportMap()), typedAssembly);
ReportTime tcConfig ("Write XML document signatures")
if tcConfig.xmlDocOutputFile.IsSome then
match xmlDocOutputOpt with
| Some(_) ->
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

converted from if to match because i need type inference, otherwise i must specify method signature for xmlDocOutputOpt param (more invasive diff)

@enricosada
Copy link
Contributor Author

ok, i need to do more testing but that should be ok to review at least /cc @dsyme. I'll do more testing asap

I changed only internal functions inside fsc.fs, moving all binding one step up in the call stack (main4 -> main3 -> .. main0 ) so the previous review of the others files is ok.

i moved some string path -> EmitTo where possible (like xmldoc)

I cannot move the outfile string path upper in the call stack because i need to change TastPickle.pickleObjWithDanglingCcus signature inside EncodeOptimizationData, but this pr is already big. i can continue later with another.

btw a note, i think we need to change some members of TcConfig, like TcConfig.outputFile from string option to EmitTo, so it's easier to propagate the change from path to path or stream. Typed refactoring is really easier.

After that, we can use a wrapper of main1 as api in the compiler, so a clean api but the same internal code of the compiler, that should help compiler service (or internally the vs extensions)

I know it's seem useless now, but i really think this refactoring can help with the coreclr because now the code is easier to change (add portable pdb output as another parameter for example)

Each step main0 -> main1 -> main2 -> ... remove arguments, instead of create new ones or local bindings, so it's easier to understand.

| ILBinaryWriter.EmitTo.Stream stream ->
stream.Write(resource.Bytes, 0, resource.Bytes.Length)
| ILBinaryWriter.EmitTo.File sigDataFileName ->
File.WriteAllBytes(sigDataFileName,resource.Bytes);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a .WriteAllBytes instance method to ILBinaryWriter.EmitTo since this match-and-write-all-bytes is used in multiple places.

let sigDataFileName = (Filename.chopExtension outfile)+".fsdata"
File.WriteAllBytes(sigDataFileName,bytes)
//TODO instead of bytes array we can directly write to stream, less memory pressure
bytes |> emitAllBytes fsDataP
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW it looks to me like useSignatureDataFile is always false and this is dead code, as is anything to do with fsData files.

@enricosada
Copy link
Contributor Author

i'll rebase to fix emrge conflict

@KevinRansom
Copy link
Member

@enricosada Enrico, in your opinion is this ready to merge?

@enricosada
Copy link
Contributor Author

Yes, after rebase.

@enricosada
Copy link
Contributor Author

I'll do that soon, this week

@dsyme dsyme changed the title Allow compilation to in memory stream, part 1: Fsc output [WIP] Allow compilation to in memory stream, part 1: Fsc output Jan 25, 2016
@dsyme
Copy link
Contributor

dsyme commented Jan 25, 2016

Marked WIP until rebase is done (please remove once it is done)

@dsyme
Copy link
Contributor

dsyme commented Mar 2, 2016

@enricosada Can you bring this up to date with CI tests passing please? It would be good to get it merged. Thanks

@enricosada
Copy link
Contributor Author

@dsyme i'll wait the merge of coreclr branch inside master.
I dont want to make the merge more difficult for @KevinRansom because of pdb work

@KevinRansom
Copy link
Member

@enricosada coreclr is now merged into master and pretty stable, would you like to take care of this?

Thanks

Kevin

@KevinRansom
Copy link
Member

@enricosada,

Hey ... do you want to finish this off or shall I close this PR?

Thanks

Kevin

@enricosada
Copy link
Contributor Author

i'll close now and reopen a new pr after a rebase, too much diff atm

@enricosada enricosada closed this Jun 6, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants