First analysis is much slower on Windows after a reboot (because of Windows Defender scanning files as they're read)

There have been a few issues about the first analysis on Windows being really slow. Some have noted it's only the first time after a reboot, for example:

- https://github.com/Dart-Code/Dart-Code/issues/5251
- https://github.com/dart-lang/sdk/issues/52947#issuecomment-1877514843

When it came up again recently, I spent some time trying to reproduce it (I had managed to in the past, but couldn't more recently) and discovered that my Windows Defender settings had exclusions for:

- The dev folder where my test project was
- My whole PubCache folder
- My `.dartServer` folder

When I removed those exclusions, I was able to reproduce 30+ seconds of initial analysis after a reboot (both via IDEs and `dart analyze`), with subsequent analysis taking just a few seconds. It appears that the first read of a file is significantly slower due to scanning by Windows Defender. Because the analysis server reads a lot of files (from `.dartServer\.analysis-driver`) synchronously and sequentially, this has a big impact on the initial analysis time after a reboot.

I was able to reproduce the same behaviour from a standalone script that just tries to read the same files that the analyzer ends up reading at startup:

```dart
import 'dart:io';
import 'files.dart'; // A list of 2760 files that exist in '.dartServer\.analysis-driver'

var totalBytes = 0;
var totalFiles = 0;

void main(List<String> arguments) {
  var sw = Stopwatch()..start();
  files.map(read).toList();
  print('Took ${sw.elapsed} to read $totalBytes bytes from $totalFiles files');
}

void read(String path) {
  totalBytes += File(path).readAsBytesSync().length;
  totalFiles++;
}
```

Results look like:

```
// First:
// Took 0:00:15.280430 to read 34865125 bytes from 2760 files
// Subsequent:
// Took 0:00:00.123502 to read 34865125 bytes from 2760 files
// Took 0:00:00.115271 to read 34865125 bytes from 2760 files
// Took 0:00:00.112667 to read 34865125 bytes from 2760 files
```
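The `files.dart` import used by the script above isn't shown. For anyone wanting to try a similar measurement, one possible way to build a comparable list is to enumerate a directory at runtime. This is only a sketch: `listFilesUnder` is a hypothetical helper, and listing every file under a folder is an approximation, since the original list contained specifically the files the analyzer reads at startup.

```dart
import 'dart:io';

/// Hypothetical helper (not part of the original script): returns the paths
/// of all regular files under [root], searching subdirectories too.
List<String> listFilesUnder(String root) {
  return Directory(root)
      .listSync(recursive: true, followLinks: false)
      .whereType<File>()
      .map((file) => file.path)
      .toList();
}

void main(List<String> arguments) {
  // Assumption: the directory to scan is passed as the first argument,
  // falling back to the current directory.
  final root = arguments.isNotEmpty ? arguments.first : Directory.current.path;
  final files = listFilesUnder(root);
  print('Found ${files.length} files under $root');
}
```

The returned list could then be passed to `read` in place of the imported `files` constant.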

[Dart generally discourages async IO](https://dart.dev/tools/linter-rules/avoid_slow_async_io), but I was curious what the difference would be if we could trigger all the reads up front simultaneously, so I wrote an equivalent script that uses async IO (calling `readAsBytes` on every file at the same time, although this is something the analyzer is unlikely to be able to do):

```dart
import 'dart:io';
import 'files.dart';

var totalBytes = 0;
var totalFiles = 0;

Future<void> main(List<String> arguments) async {
  var sw = Stopwatch()..start();
  await Future.wait(files.map(readAsync));
  print('Took ${sw.elapsed} to read $totalBytes bytes from $totalFiles files');
}

Future<void> readAsync(String path) async {
  var bytes = (await File(path).readAsBytes()).length;
  totalBytes += bytes;
  totalFiles++;
}
```

The results are (perhaps unsurprisingly) much better that way:

```
// First:
// Took 0:00:01.142500 to read 34865125 bytes from 2760 files
// Subsequent:
// Took 0:00:00.156074 to read 34865125 bytes from 2760 files
// Took 0:00:00.160162 to read 34865125 bytes from 2760 files
// Took 0:00:00.158144 to read 34865125 bytes from 2760 files
```
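To make the first-run numbers easier to compare, the timings can be converted to an average cost per file. The snippet below is a small sketch that only does arithmetic on the durations already reported above; `microsPerFile` is a name introduced here for illustration.

```dart
/// Average time per file for a run, in microseconds.
double microsPerFile(Duration elapsed, int fileCount) =>
    elapsed.inMicroseconds / fileCount;

void main() {
  const fileCount = 2760;
  // Durations copied from the first-run results reported above.
  const syncFirst = Duration(seconds: 15, microseconds: 280430);
  const asyncFirst = Duration(seconds: 1, microseconds: 142500);

  final syncCost = microsPerFile(syncFirst, fileCount);
  final asyncCost = microsPerFile(asyncFirst, fileCount);
  final ratio = syncFirst.inMicroseconds / asyncFirst.inMicroseconds;

  print('sync first run:  ${syncCost.toStringAsFixed(0)} µs per file');
  print('async first run: ${asyncCost.toStringAsFixed(0)} µs per file');
  print('ratio:           ${ratio.toStringAsFixed(1)}x');
}
```

On these numbers, the first synchronous run averages roughly 5.5 ms per file, against roughly 0.4 ms per file when all the reads are started together.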

Since changing sync->async isn't always easy, I also measured opening/closing all files asynchronously in advance of reading them all synchronously to see how that worked (this results in duplicate work but avoids changing the real "implementation" from sync):

```dart
import 'dart:io';
import 'files.dart';

var totalBytes = 0;
var totalFiles = 0;

Future<void> main(List<String> arguments) async {
  var sw = Stopwatch()..start();
  // Open and close every file 😬
  await Future.wait(files.map((path) async => (await File(path).open()).close()));
  // Original sync work
  files.map(read).toList();
  print('Took ${sw.elapsed} to read $totalBytes bytes from $totalFiles files');
}

void read(String path) {
  totalBytes += File(path).readAsBytesSync().length;
  totalFiles++;
}
```

The results looked like:

```
// First:
// Took 0:00:03.723306 to read 34865125 bytes from 2760 files
// Subsequent:
// Took 0:00:00.187244 to read 34865125 bytes from 2760 files
// Took 0:00:00.113760 to read 34865125 bytes from 2760 files
// Took 0:00:00.185479 to read 34865125 bytes from 2760 files
```
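The open/close warm-up above starts all 2760 opens at once. If having that many files open at the same time were a concern, a variant could cap the number in flight. The following is an unmeasured sketch of that idea, not something tested against Windows Defender: `warmUp` and its default `concurrency` of 64 are assumptions chosen for illustration.

```dart
import 'dart:io';

/// Hypothetical helper: opens and immediately closes each path, keeping at
/// most [concurrency] opens in flight at any time.
Future<void> warmUp(List<String> paths, {int concurrency = 64}) async {
  var next = 0; // Index of the next path to process, shared by all workers.

  Future<void> worker() async {
    // Dart runs this isolate on a single thread, so reading and
    // incrementing `next` between awaits is safe without locking.
    while (next < paths.length) {
      final path = paths[next++];
      try {
        final handle = await File(path).open();
        await handle.close();
      } on FileSystemException {
        // Skip files that are missing or unreadable; the later
        // synchronous read is where real errors would surface.
      }
    }
  }

  await Future.wait(List.generate(concurrency, (_) => worker()));
}

Future<void> main(List<String> arguments) async {
  // Assumption: the paths to warm up are passed as command-line arguments.
  final sw = Stopwatch()..start();
  await warmUp(arguments);
  print('Warmed ${arguments.length} files in ${sw.elapsed}');
}
```

Whether a cap helps or hurts the first-run time would need measuring in the same way as the scripts above.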

Given how the filenames are computed, I'm not sure whether it's feasible to change anything in the analyzer to improve this, nor am I sure how worthwhile it is to solve. Probably most Windows users are affected to some degree, but only the first time they open their projects after a reboot, and the impact might depend on the specs of their machine and the size of their workspace. I thought it was worth capturing here though, in case others have ideas, or at least as a summary that other related issues can reference.

(@bwilkerson @scheglov FYI)

First analysis is much slower on Windows after a reboot (because of Windows Defender scanning files as they're read) #56755

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

First analysis is much slower on Windows after a reboot (because of Windows Defender scanning files as they're read) #56755

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions