Investigate improvements to our disk usage on file timestamp checks #5972

marcpopMSFT · 2020-12-14T23:28:04Z

One option brought up for improving performance has been to reduce the amount that we check files on disks. We discussed possibly having a file watcher or using one that's already out there like facebook/watchman. We could also potentially share watchers with the various IDEs we plug into as they already have watchers so alternatively we could build a way for them to plug into that and share their results with other tools through MSBuild.

This will require a lot more investigation so posting the summary for now for tracking.

benvillalobos · 2021-01-06T16:59:11Z

From #6001

By @KirillOsenkov
https://source.dot.net/#Microsoft.Build.Tasks.Core/Copy.cs,8c737dc13878fc22

I've been profiling and IsMatchingSizeAndTimeStamp is showing up a lot:

We should investigate if we can use a single file system call that gets both existence and timestamp in a single call.

On Windows calling FindFirstFile is almost sure to be way faster and with lower overhead.

ladipro · 2021-09-22T14:38:23Z

We should investigate if we can use a single file system call that gets both existence and timestamp in a single call.

On Windows calling FindFirstFile is almost sure to be way faster and with lower overhead.

@KirillOsenkov IsMatchingSizeAndTimeStamp it is currently using:

msbuild/src/Tasks/FileState.cs

Line 104 in 2f1e9ca

bool success = NativeMethodsShared.GetFileAttributesEx(_filename, 0, ref data);

Traces show more than one stack because there are multiple ways to initialize the Lazy, but under the covers it is always this one single call per file.

^ we're spending ~6 seconds of CPU here when building OrchardCore, definitely something to improve.

KirillOsenkov · 2021-09-24T05:56:23Z

Do we know if we're calling these multiple times for the same file? Are they cached? If they're called multiple times, how many times per file? Should we perhaps scan entire directories and then answer these questions based on the info we read from the file system? The assumption is that it could be better to scan a directory once, keep the file lists, sizes and timestamps in memory, and then answer questions off of that table (at least during evaluation where presumably the underlying file system shouldn't change).

ladipro · 2021-10-08T07:53:29Z

I have prototyped a cache with FileSystemWatcher-based invalidation to measure the impact on incremental build. The cache is called from the biggest offenders in terms of file metadata checks (existence, timestamp, size, attributes):

NativeMethodsShared.GetLastWriteFileUtcTime
NativeMethodsShared.GetLastWriteDirectoryUtcTime
FileState.FileDirInfo (as used by the Copy task)

The following table summarizes the CPU time burned when doing a no-change incremental build of OrchardCore. It is a cold scenario where no MSBuild process is running when the build is initiated (i.e. in-memory caches are empty).

	CPU total	CPU in kernel32!GetFileAttributesEx	CPU overhead of caching
Baseline	230 s	79 s	-
Immutable files in static cache (#6761)	185 s	35 s	< 1 s
Immutable files in static cache + mutable files in file watcher cache	165 s	14 s	< 1 s

Watchers were set up on demand for each directory with at least one affected file. At the end of the build MSBuild was watching 3400 directories and had cached metadata on 33000 files total. There was no attempt to unify/coalesce directories to reduce the number of directories watched.

The perf numbers are great but there's a major catch. File watcher notifications come asynchronously on a separate thread and there are no guarantees that a notification for a file arrives and is processed before we make a build decision based on the file.

Here's a simple program demonstrating the asynchrony:

        static volatile bool fileExists = false;
        static void Main(string[] args)
        {
            string tempDir = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString());
            Directory.CreateDirectory(tempDir);
            try
            {
                FileSystemWatcher watcher = new FileSystemWatcher(tempDir);
                watcher.Created += (object sender, FileSystemEventArgs e) => fileExists = true;
                watcher.Deleted += (object sender, FileSystemEventArgs e) => fileExists = false;
                watcher.EnableRaisingEvents = true;

                string tempFile = Path.Combine(tempDir, "File.txt");
                using (var fileStream = File.Create(tempFile))
                {
                    Console.WriteLine($"File exists after Create: {fileExists}");
                }
                File.Delete(tempFile);
                Console.WriteLine($"File exists after Delete: {fileExists}");
            }
            finally
            {
                Directory.Delete(tempDir, true);
            }
            Console.ReadLine();
        }

On my Windows machine the program usually prints:

File exists after Create: False
File exists after Delete: True

which is exactly the opposite of the correct output. We are not aware of the file immediately after it was created and we believe that it's still there right after deleting it. In this simple example the race window is small but there are no guarantees that it couldn't get much larger on a different platform, on a loaded system, or with other I/O patterns. This is a no-go for MSBuild. Our I/O has to be synchronous, or at least we would need a way to issue an "I/O barrier" before each metadata check to make sure that all watchers queued to run for I/O writes that happened before the current point in time have actually finished running.

Even for evaluation which is supposed to be a read-only operation so it's OK to take a snapshot and ignore changes, we would be running into races if we attempted to cache metadata using watchers between evaluations.

I am closing the issue based on the information above.

@KirillOsenkov:

Do we know if we're calling these multiple times for the same file? Are they cached? If they're called multiple times, how many times per file?

Please see #6822 (comment) for data on this.

The assumption is that it could be better to scan a directory once, keep the file lists, sizes and timestamps in memory, and then answer questions off of that table (at least during evaluation where presumably the underlying file system shouldn't change).

Yes, it should be possible during evaluation. Outside of evaluation I'm afraid we could be running into same races as above. We are tracking it in #3586.

marcpopMSFT mentioned this issue Dec 14, 2020

Improve performance of CLI/SDK/MSBuild as it relates to the developer inner-loop #5876

Closed

12 tasks

benvillalobos mentioned this issue Jan 6, 2021

Copy task: IsMatchingSizeAndTimeStamp is a bottleneck #6001

Closed

rokonec added the backlog label Apr 16, 2021

ladipro mentioned this issue Aug 13, 2021

Epic: Make up-to-date checks faster #6757

Closed

7 tasks

ladipro removed the backlog label Aug 13, 2021

ladipro added the size:5 label Sep 13, 2021

ladipro self-assigned this Sep 13, 2021

ladipro closed this as completed Oct 8, 2021

ladipro added this to the VS 17.1 milestone Dec 8, 2021

AR-May added the triaged label Feb 21, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Investigate improvements to our disk usage on file timestamp checks #5972

Investigate improvements to our disk usage on file timestamp checks #5972

marcpopMSFT commented Dec 14, 2020

benvillalobos commented Jan 6, 2021

ladipro commented Sep 22, 2021

KirillOsenkov commented Sep 24, 2021

ladipro commented Oct 8, 2021

Investigate improvements to our disk usage on file timestamp checks #5972

Investigate improvements to our disk usage on file timestamp checks #5972

Comments

marcpopMSFT commented Dec 14, 2020

benvillalobos commented Jan 6, 2021

From #6001

ladipro commented Sep 22, 2021

KirillOsenkov commented Sep 24, 2021

ladipro commented Oct 8, 2021