Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate improvements to our disk usage on file timestamp checks #5972

Closed
Tracked by #6757
marcpopMSFT opened this issue Dec 14, 2020 · 4 comments
Closed
Tracked by #6757
Assignees
Labels
Cost:L Work that requires one engineer up to 4 weeks performance Performance-Scenario-Build This issue affects build performance. Priority:1 Work that is critical for the release, but we could probably ship without size:5 triaged User Story A single user-facing feature. Can be grouped under an epic.
Milestone

Comments

@marcpopMSFT
Copy link
Member

One option brought up for improving performance has been to reduce the amount that we check files on disks. We discussed possibly having a file watcher or using one that's already out there like facebook/watchman. We could also potentially share watchers with the various IDEs we plug into as they already have watchers so alternatively we could build a way for them to plug into that and share their results with other tools through MSBuild.

This will require a lot more investigation so posting the summary for now for tracking.

@marcpopMSFT marcpopMSFT added performance Performance-Scenario-Build This issue affects build performance. User Story A single user-facing feature. Can be grouped under an epic. Priority:1 Work that is critical for the release, but we could probably ship without Cost:L Work that requires one engineer up to 4 weeks labels Dec 14, 2020
@benvillalobos
Copy link
Member

From #6001

By @KirillOsenkov
https://source.dot.net/#Microsoft.Build.Tasks.Core/Copy.cs,8c737dc13878fc22

I've been profiling and IsMatchingSizeAndTimeStamp is showing up a lot:
image

We should investigate if we can use a single file system call that gets both existence and timestamp in a single call.

On Windows calling FindFirstFile is almost sure to be way faster and with lower overhead.

@ladipro
Copy link
Member

ladipro commented Sep 22, 2021

We should investigate if we can use a single file system call that gets both existence and timestamp in a single call.

On Windows calling FindFirstFile is almost sure to be way faster and with lower overhead.

@KirillOsenkov IsMatchingSizeAndTimeStamp it is currently using:

bool success = NativeMethodsShared.GetFileAttributesEx(_filename, 0, ref data);

Traces show more than one stack because there are multiple ways to initialize the Lazy, but under the covers it is always this one single call per file.

image

^ we're spending ~6 seconds of CPU here when building OrchardCore, definitely something to improve.

@KirillOsenkov
Copy link
Member

Do we know if we're calling these multiple times for the same file? Are they cached? If they're called multiple times, how many times per file? Should we perhaps scan entire directories and then answer these questions based on the info we read from the file system? The assumption is that it could be better to scan a directory once, keep the file lists, sizes and timestamps in memory, and then answer questions off of that table (at least during evaluation where presumably the underlying file system shouldn't change).

@ladipro
Copy link
Member

ladipro commented Oct 8, 2021

I have prototyped a cache with FileSystemWatcher-based invalidation to measure the impact on incremental build. The cache is called from the biggest offenders in terms of file metadata checks (existence, timestamp, size, attributes):

  • NativeMethodsShared.GetLastWriteFileUtcTime
  • NativeMethodsShared.GetLastWriteDirectoryUtcTime
  • FileState.FileDirInfo (as used by the Copy task)

The following table summarizes the CPU time burned when doing a no-change incremental build of OrchardCore. It is a cold scenario where no MSBuild process is running when the build is initiated (i.e. in-memory caches are empty).

CPU total CPU in kernel32!GetFileAttributesEx CPU overhead of caching
Baseline 230 s 79 s -
Immutable files in static cache (#6761) 185 s 35 s < 1 s
Immutable files in static cache + mutable files in file watcher cache 165 s 14 s < 1 s

Watchers were set up on demand for each directory with at least one affected file. At the end of the build MSBuild was watching 3400 directories and had cached metadata on 33000 files total. There was no attempt to unify/coalesce directories to reduce the number of directories watched.

The perf numbers are great but there's a major catch. File watcher notifications come asynchronously on a separate thread and there are no guarantees that a notification for a file arrives and is processed before we make a build decision based on the file.

Here's a simple program demonstrating the asynchrony:

        static volatile bool fileExists = false;
        static void Main(string[] args)
        {
            string tempDir = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString());
            Directory.CreateDirectory(tempDir);
            try
            {
                FileSystemWatcher watcher = new FileSystemWatcher(tempDir);
                watcher.Created += (object sender, FileSystemEventArgs e) => fileExists = true;
                watcher.Deleted += (object sender, FileSystemEventArgs e) => fileExists = false;
                watcher.EnableRaisingEvents = true;

                string tempFile = Path.Combine(tempDir, "File.txt");
                using (var fileStream = File.Create(tempFile))
                {
                    Console.WriteLine($"File exists after Create: {fileExists}");
                }
                File.Delete(tempFile);
                Console.WriteLine($"File exists after Delete: {fileExists}");
            }
            finally
            {
                Directory.Delete(tempDir, true);
            }
            Console.ReadLine();
        }

On my Windows machine the program usually prints:

File exists after Create: False
File exists after Delete: True

which is exactly the opposite of the correct output. We are not aware of the file immediately after it was created and we believe that it's still there right after deleting it. In this simple example the race window is small but there are no guarantees that it couldn't get much larger on a different platform, on a loaded system, or with other I/O patterns. This is a no-go for MSBuild. Our I/O has to be synchronous, or at least we would need a way to issue an "I/O barrier" before each metadata check to make sure that all watchers queued to run for I/O writes that happened before the current point in time have actually finished running.

Even for evaluation which is supposed to be a read-only operation so it's OK to take a snapshot and ignore changes, we would be running into races if we attempted to cache metadata using watchers between evaluations.

I am closing the issue based on the information above.

@KirillOsenkov:

Do we know if we're calling these multiple times for the same file? Are they cached? If they're called multiple times, how many times per file?

Please see #6822 (comment) for data on this.

The assumption is that it could be better to scan a directory once, keep the file lists, sizes and timestamps in memory, and then answer questions off of that table (at least during evaluation where presumably the underlying file system shouldn't change).

Yes, it should be possible during evaluation. Outside of evaluation I'm afraid we could be running into same races as above. We are tracking it in #3586.

@ladipro ladipro closed this as completed Oct 8, 2021
@ladipro ladipro added this to the VS 17.1 milestone Dec 8, 2021
@AR-May AR-May added the triaged label Feb 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Cost:L Work that requires one engineer up to 4 weeks performance Performance-Scenario-Build This issue affects build performance. Priority:1 Work that is critical for the release, but we could probably ship without size:5 triaged User Story A single user-facing feature. Can be grouped under an epic.
Projects
None yet
Development

No branches or pull requests

6 participants