HostWriter Performance Improvements #48774

Merged on Mar 3, 2021 (4 commits)

brianrob
Member

  • Use CopyOnWrite MemoryMappedFiles to avoid copying the template host.
  • Avoid a file open/read when attempting to remove a MachO signature when we know the input host is a PE file.
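
A minimal sketch of the copy-on-write idea in the first bullet, not the PR's actual code: the file is mapped read-only and a CopyOnWrite view is created over it, so writes through the view go to private pages and the file on disk is left untouched. The class name, the temp file and the byte values are placeholders invented for illustration.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

static class CopyOnWriteSketch
{
    // Patches the first byte through a copy-on-write view and returns
    // the value seen through the view and the value still stored on disk.
    public static (byte InView, byte OnDisk) PatchFirstByte(string path, byte newValue)
    {
        byte inView;

        // Capacity 0 maps the whole file; the file is only opened for reading.
        using (MemoryMappedFile mapped = MemoryMappedFile.CreateFromFile(
            path, FileMode.Open, null, 0, MemoryMappedFileAccess.Read))
        using (MemoryMappedViewAccessor view = mapped.CreateViewAccessor(
            0, 0, MemoryMappedFileAccess.CopyOnWrite))
        {
            view.Write(0, newValue);   // modifies a private copy of the page
            inView = view.ReadByte(0);
        }

        return (inView, File.ReadAllBytes(path)[0]);
    }

    static void Main()
    {
        // Stand-in for a template host file (assumption for the demo).
        string path = Path.GetTempFileName();
        File.WriteAllBytes(path, new byte[] { 1, 2, 3, 4 });
        try
        {
            var result = PatchFirstByte(path, 42);
            Console.WriteLine($"view: {result.InView}, disk: {result.OnDisk}");
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

The patched bytes would still have to be written out to the destination file separately; the sketch only shows that the source is not modified.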

…'t attempt to remove the MachO signature for known PE files.
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-HostModel Microsoft.NET.HostModel issues label Feb 25, 2021
@ghost

ghost commented Feb 25, 2021

Tagging subscribers to this area: @vitek-karas, @agocke
See info in area-owners.md if you want to be subscribed.


@brianrob
Member Author

@vitek-karas, @agocke I've marked this as a draft PR so that I can get some additional coverage. At the same time, I'm interested in your thoughts. I'm looking to improve this since it shows up as an area of opportunity in the dev inner-loop performance effort, and the file copy has a relatively heavy impact on Windows when antivirus is involved. I still need to put together some comparison numbers to see if this has a material performance benefit.

@brianrob
Member Author

On my local machine, I see a net improvement of 66ms (59.45%) in disk and antivirus time for the host exe during _CreateAppHost. This is on a console hello-world build.

Baseline:
Total disk/AV time for the host exe: 111ms (5 scan requests)

With this PR:
Total disk/AV time for the host exe: 45ms (2 scan requests)

@brianrob
Member Author

cc: @DamianEdwards, @stephentoub

…le permission. This is OK, because the installed version of the apphost has -rwxr-xr-x permissions.
@brianrob
Member Author

As an additional test, I verified with the latest bits that the host after the transform is identical to the one produced without my change on Windows and Linux. It appears that on OSX we don't emit a host. Is this new? If so, we can probably trim out the MachO signature removal, unless this is just temporary.

@brianrob brianrob marked this pull request as ready for review February 26, 2021 00:30
}
finally
{
if (memoryMappedViewAccessor != null)
Member

can either of these actually be null here?

Member Author

Yes. As an example, if MemoryMappedFile.CreateFromFile throws, then memoryMappedViewAccessor would be null.
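
A small sketch of that failure mode, assuming the locals are declared as null before the try block as in the snippet under review: when CreateFromFile throws (here because the path does not exist), the finally block runs while both variables are still null. The class and method names are hypothetical.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

static class NullInFinallySketch
{
    // Returns true if the accessor was still null when the finally block ran.
    public static bool AccessorWasNullInFinally(string path)
    {
        MemoryMappedFile memoryMappedFile = null;
        MemoryMappedViewAccessor memoryMappedViewAccessor = null;
        bool accessorWasNull = false;

        try
        {
            // Throws FileNotFoundException (an IOException) for a missing file,
            // so neither assignment below completes.
            memoryMappedFile = MemoryMappedFile.CreateFromFile(
                path, FileMode.Open, null, 0, MemoryMappedFileAccess.Read);
            memoryMappedViewAccessor = memoryMappedFile.CreateViewAccessor(
                0, 0, MemoryMappedFileAccess.Read);
        }
        catch (IOException)
        {
            // Swallowed only so the demo can report the result.
        }
        finally
        {
            accessorWasNull = memoryMappedViewAccessor == null;
            if (memoryMappedViewAccessor != null)
            {
                memoryMappedViewAccessor.Dispose();
            }
            if (memoryMappedFile != null)
            {
                memoryMappedFile.Dispose();
            }
        }

        return accessorWasNull;
    }

    static void Main()
    {
        string missing = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N"));
        Console.WriteLine(AccessorWasNullInFinally(missing));
    }
}
```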

Member

Oop..

int pos = 0;
int bufSize = 16384; //16K

byte[] buf = new byte[bufSize];
Member

is ArrayPool available?

Member Author

Yes, though I purposely did not use it here. This code is invoked from MSBuild once per process, and the only existing use of ArrayPool that I've been able to find is from RegularExpressions, which allocates char[]. My goal here is to avoid incurring the cost of initializing an ArrayPool<byte> when we only need one byte[] in this code for the life of the process.
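
For illustration, a sketch of the single-buffer approach described here, assuming the goal is to copy the contents of a mapped view to a stream in 16K chunks. It is not the PR's implementation; the class and method names are hypothetical, and the explicit length parameter is there because a view's Capacity can be rounded up past the file size.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

static class SingleBufferCopySketch
{
    // Copies the first `length` bytes of a mapped view to a stream,
    // reusing one plain byte[] rather than renting from ArrayPool<byte>.
    public static void WriteViewToStream(MemoryMappedViewAccessor view, Stream destination, long length)
    {
        const int BufferSize = 16384; // 16K, as in the snippet under review
        byte[] buffer = new byte[BufferSize];
        long position = 0;
        length = Math.Min(length, view.Capacity);

        while (position < length)
        {
            int requested = (int)Math.Min(length - position, BufferSize);
            int read = view.ReadArray(position, buffer, 0, requested);
            if (read <= 0)
            {
                break;
            }
            destination.Write(buffer, 0, read);
            position += read;
        }
    }

    static void Main()
    {
        // Placeholder input: 40,000 pseudo-random bytes in a temp file.
        string source = Path.GetTempFileName();
        byte[] payload = new byte[40000];
        new Random(1).NextBytes(payload);
        File.WriteAllBytes(source, payload);
        try
        {
            using (MemoryMappedFile mapped = MemoryMappedFile.CreateFromFile(
                source, FileMode.Open, null, 0, MemoryMappedFileAccess.Read))
            using (MemoryMappedViewAccessor view = mapped.CreateViewAccessor(
                0, 0, MemoryMappedFileAccess.Read))
            using (var output = new MemoryStream())
            {
                WriteViewToStream(view, output, payload.Length);
                Console.WriteLine(output.Length == payload.Length);
            }
        }
        finally
        {
            File.Delete(source);
        }
    }
}
```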

Base automatically changed from master to main March 1, 2021 09:08
@brianrob
Member Author

brianrob commented Mar 1, 2021

FYI @ericstj

@ericstj
Member

ericstj commented Mar 1, 2021

I believe this will fix #3832

Comment on lines 130 to 132
RetryUtil.RetryOnWin32Error(UpdateResources);

RetryUtil.RetryOnIOError(RemoveSignatureIfMachO);
Member

Won't both of these reopen the file? Can they be rewritten to use the memory mapped file?

Member Author

@brianrob brianrob Mar 1, 2021

Yes, both of these will re-open the file. UpdateResources uses win32 APIs to enumerate and add the resources. There might be a way to do this against the existing open MemoryMappedFile, but likely the cost of doing so is going to be higher than we want to pay, if it's possible. It would require that we re-write the functionality.

RemoveSignatureIfMachO is potentially possible, but it became much harder to implement because the size of the file has to change. That said, I did change the code so that this doesn't get called unconditionally for PE files.

Member

We may eventually want to rewrite UpdateResources so it can be supported xplat, but the work is not planned right now.

Member

@ericstj ericstj Mar 2, 2021

There's a trick where you keep a handle open for read/write with sharing enabled, which prevents any other process from getting exclusive access. So even if you didn't re-implement these over the file mapping, you might be able to keep a handle open to prevent interference. This works as long as the Windows APIs you're calling specify read/write sharing. If the Windows APIs themselves don't allow sharing, your held handle prevents them from opening the file.
You can procmon the Windows API to see how it opens the file.
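
A rough sketch of that trick using FileStream, under the assumption that the interfering opener asks for exclusive access (FileShare.None). The names are hypothetical. On Windows the share mode is enforced by the OS; on other platforms .NET only emulates it with advisory locks, so the result there may differ.

```csharp
using System;
using System.IO;

static class SharedHandleSketch
{
    // Returns true if an exclusive open fails while another handle
    // is held open with read/write sharing.
    public static bool ExclusiveOpenIsBlocked(string path)
    {
        // The "guard" handle: allows cooperating readers and writers.
        using (var held = new FileStream(path, FileMode.Open, FileAccess.ReadWrite, FileShare.ReadWrite))
        {
            try
            {
                // An opener that demands exclusive access.
                using (new FileStream(path, FileMode.Open, FileAccess.ReadWrite, FileShare.None))
                {
                    return false;
                }
            }
            catch (IOException)
            {
                return true;
            }
        }
    }

    static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            Console.WriteLine(ExclusiveOpenIsBlocked(path));
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

An opener that itself passes FileShare.ReadWrite would still succeed alongside the held handle, which is the condition the comment above places on the Windows APIs being called.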

Member Author

I did some experiments with this, but it doesn't look like this is possible. The file is loaded via LoadLibraryEx, which does not appear to allow for sharing when an existing handle has write permissions.

Member

@ericstj ericstj Mar 3, 2021

I see, I guess when UpdateResource is rewritten that would fix this. Presumably some code can be copied from CSC since it knows how to do that.

For RemoveSignatureIfMachO just make your new BinaryUtils.SaveFile take a stream and open the stream above. Then change RemoveSignatureIfMachO to take a stream and give it the written file stream (assuming reordering of these ops is safe). This is minimal change and saves one roll-of-the-dice around retries.
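
The shape of that suggestion, sketched with hypothetical stand-ins (Save and TrimTrailingBytes are invented for illustration and are not the real BinaryUtils.SaveFile or RemoveSignatureIfMachO): one FileStream is opened once and handed to both the write step and a post-processing step that changes the file's size, so the file is not reopened in between.

```csharp
using System;
using System.IO;

static class SharedStreamSketch
{
    // Hypothetical stand-in for a save step that takes a stream.
    public static void Save(byte[] contents, Stream destination)
    {
        destination.Write(contents, 0, contents.Length);
    }

    // Hypothetical stand-in for a post-processing step that shrinks the file.
    public static void TrimTrailingBytes(Stream stream, int count)
    {
        stream.SetLength(Math.Max(0L, stream.Length - count));
    }

    // Opens the destination once and runs both steps against the same stream.
    public static long WriteAndTrim(string path, byte[] contents, int trailingBytes)
    {
        using (var stream = new FileStream(path, FileMode.Create, FileAccess.ReadWrite, FileShare.None))
        {
            Save(contents, stream);
            TrimTrailingBytes(stream, trailingBytes);
            return stream.Length;
        }
    }

    static void Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            long finalLength = WriteAndTrim(path, new byte[10], 4);
            Console.WriteLine(finalLength);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```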

Member Author

Good call. At this point it is for sure safe because resources can only be copied into a PEFile, and the signature can only be removed from MachO binaries. So the operations cannot be run on the same binary.

Member

Another thought on UpdateResource: right now this needs to be incremental on the binary output of CSC, which means the host needs to be rewritten whenever CSC recompiles. I wonder if you could instead use the input to CSC here (e.g., the -win32res parameter), potentially eliminating this task from most normal inner-loop scenarios.

Member Author

That's an interesting point. I have filed dotnet/sdk#16173 to track this.

Member

@agocke agocke left a comment

LGTM aside from a few comments. Thanks for the work!

Comment on lines 119 to 126
if (memoryMappedViewAccessor != null)
{
memoryMappedViewAccessor.Dispose();
}
if (memoryMappedFile != null)
{
memoryMappedFile.Dispose();
}
Member

Suggested change
- if (memoryMappedViewAccessor != null)
- {
-     memoryMappedViewAccessor.Dispose();
- }
- if (memoryMappedFile != null)
- {
-     memoryMappedFile.Dispose();
- }
+ memoryMappedViewAccessor?.Dispose();
+ memoryMappedFile?.Dispose();

Member Author

Done.


RetryUtil.RetryOnIOError(RemoveSignatureIfMachO);

RetryUtil.RetryOnIOError(SetLastWriteTime);
Member

Is this necessary if we're calling SaveFile? It looks like the previous concern was that simple memory map modification didn't update the last write time, but I'm unclear if that's also true of SaveFile.

Member Author

Good point. It looks like the last write time is updated when we open and write the stream, so I've removed this call.

Member

@brianrob Any comment on this?

Member

Never mind, the diff here just didn't reparent after your changes.

@brianrob
Member Author

brianrob commented Mar 3, 2021

I think I've hit all the feedback. @ericstj let me know if you have anything else based on the latest iteration.

@brianrob brianrob merged commit 872d9e4 into dotnet:main Mar 3, 2021
@brianrob brianrob deleted the hostwriter-perf branch March 3, 2021 15:06
@ghost ghost locked as resolved and limited conversation to collaborators Apr 2, 2021
@karelz karelz added this to the 6.0.0 milestone May 20, 2021