First analysis is much slower on Windows after a reboot (because of Windows Defender scanning files as they're read) #56755

Open
@DanTup

There have been several issues reporting that the first analysis on Windows is really slow. Some have noted that it only happens the first time after a reboot.

When it came up again recently, I spent some time trying to reproduce it (I had managed to in the past, but not recently) and discovered that my Windows Defender settings had exclusions for:

  • The dev folder where my test project was
  • My whole PubCache folder
  • My .dartServer folder

When I removed those exclusions, I was able to reproduce 30+ seconds of initial analysis after a reboot (both via IDEs and dart analyze), with subsequent analysis taking just a few seconds. It appears that the first read of a file is significantly slower because Windows Defender scans it. Because the analysis server reads a lot of files (from .dartServer\.analysis-driver) synchronously and sequentially, this has a big impact on the initial analysis time after a reboot.

I was able to reproduce the same behaviour from a standalone script that just tries to read the same files that the analyzer ends up reading at startup:

import 'dart:io';
import 'files.dart'; // A list of 2760 files that exist in '.dartServer\.analysis-driver'

var totalBytes = 0;
var totalFiles = 0;

void main(List<String> arguments) {
  var sw = Stopwatch()..start();
  files.map(read).toList();
  print('Took ${sw.elapsed} to read $totalBytes bytes from $totalFiles files');
}

void read(String path) {
  totalBytes += File(path).readAsBytesSync().length;
  totalFiles++;
}

Results look like:

// First:
// Took 0:00:15.280430 to read 34865125 bytes from 2760 files
// Subsequent:
// Took 0:00:00.123502 to read 34865125 bytes from 2760 files
// Took 0:00:00.115271 to read 34865125 bytes from 2760 files
// Took 0:00:00.112667 to read 34865125 bytes from 2760 files
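
The script above depends on a files.dart list that isn't included here. As a rough sketch for anyone wanting to check their own machine, the variant below enumerates a directory passed on the command line instead and times two synchronous passes in one process. The names timeSyncReads and listFiles are made up for this example, and the script filename in the usage message is hypothetical. I'm assuming that listing the directory does not itself trigger the scan; if it does, the first pass will understate the cost.

```dart
import 'dart:io';

/// Synchronously reads every file in [paths], one after another, and
/// returns how long the whole pass took.
Duration timeSyncReads(List<String> paths) {
  final sw = Stopwatch()..start();
  for (final path in paths) {
    File(path).readAsBytesSync();
  }
  return sw.elapsed;
}

/// Recursively collects the paths of all regular files under [root].
List<String> listFiles(String root) => Directory(root)
    .listSync(recursive: true, followLinks: false)
    .whereType<File>()
    .map((f) => f.path)
    .toList();

void main(List<String> arguments) {
  if (arguments.length != 1) {
    stderr.writeln('Usage: dart run cold_vs_warm.dart <directory>');
    exitCode = 64;
    return;
  }
  final paths = listFiles(arguments.single);
  // Only the first pass after a reboot is expected to be slow.
  print('First pass:  ${timeSyncReads(paths)} for ${paths.length} files');
  print('Second pass: ${timeSyncReads(paths)} for ${paths.length} files');
}
```

Run it once right after a reboot against the .analysis-driver folder. A large gap between the two passes would be consistent with the numbers above.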

Dart generally discourages async IO, but I was curious what the difference would be if we were able to trigger all the reads up front at once, so I wrote an equivalent script that uses async IO and starts every readAsBytes call at the same time (although this is something that the analyzer is unlikely to be able to do):

import 'dart:io';
import 'files.dart';

var totalBytes = 0;
var totalFiles = 0;

Future<void> main(List<String> arguments) async {
  var sw = Stopwatch()..start();
  await Future.wait(files.map(readAsync));
  print('Took ${sw.elapsed} to read $totalBytes bytes from $totalFiles files');
}

Future<void> readAsync(String path) async {
  var bytes = (await File(path).readAsBytes()).length;
  totalBytes += bytes;
  totalFiles++;
}

The results are (perhaps unsurprisingly) much better that way:

// First:
// Took 0:00:01.142500 to read 34865125 bytes from 2760 files
// Subsequent:
// Took 0:00:00.156074 to read 34865125 bytes from 2760 files
// Took 0:00:00.160162 to read 34865125 bytes from 2760 files
// Took 0:00:00.158144 to read 34865125 bytes from 2760 files

Since changing sync->async isn't always easy, I also measured opening and closing all files asynchronously before reading them all synchronously, to see how that worked (this results in duplicate work but avoids changing the real "implementation" from sync):

import 'dart:io';
import 'files.dart';

var totalBytes = 0;
var totalFiles = 0;

Future<void> main(List<String> arguments) async {
  var sw = Stopwatch()..start();
  // Open and close every file 😬
  await Future.wait(files.map((path) async => (await File(path).open()).close()));
  // Original sync work
  files.map(read).toList();
  print('Took ${sw.elapsed} to read $totalBytes bytes from $totalFiles files');
}

void read(String path) {
  totalBytes += File(path).readAsBytesSync().length;
  totalFiles++;
}

The results looked like:

// First:
// Took 0:00:03.723306 to read 34865125 bytes from 2760 files
// Subsequent:
// Took 0:00:00.187244 to read 34865125 bytes from 2760 files
// Took 0:00:00.113760 to read 34865125 bytes from 2760 files
// Took 0:00:00.185479 to read 34865125 bytes from 2760 files

Given how the filenames are computed, I'm not sure if it's feasible to change anything in the analyzer to improve this. I'm also not sure how worthwhile it is to try to solve this: probably most Windows users are affected to some degree, but only the first time they open their projects after a reboot, and the impact might depend on the specs of their machine and the size of their workspace. I thought it was worth capturing this here though, in case others have ideas, or at least as a summary that can be referenced by other issues that might be related.
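
In case it helps the discussion, here is a minimal sketch of the open/close warm-up with a cap on how many opens are in flight at once, rather than starting all of them together as in my script above. I have not measured this variant, so I don't know whether a cap helps or hurts. The names warmUp and readAllSync, the concurrency value, and the temp files used as stand-ins for the cache are all assumptions made for the example, not anything from the analyzer.

```dart
import 'dart:async';
import 'dart:io';

/// Opens and closes every path with at most [concurrency] opens in flight,
/// so any on-access scanning is triggered before the sync reads start.
Future<void> warmUp(List<String> paths, {int concurrency = 64}) async {
  var next = 0;
  Future<void> worker() async {
    // Safe without locks: Dart runs this isolate on a single thread and
    // `next` is only touched between awaits.
    while (next < paths.length) {
      final path = paths[next++];
      try {
        final handle = await File(path).open();
        await handle.close();
      } on FileSystemException {
        // Skip missing or unreadable files; the sync read will report them.
      }
    }
  }

  await Future.wait(List.generate(concurrency, (_) => worker()));
}

/// Reads every path synchronously and returns the total byte count.
int readAllSync(List<String> paths) {
  var total = 0;
  for (final path in paths) {
    total += File(path).readAsBytesSync().length;
  }
  return total;
}

Future<void> main() async {
  // Hypothetical stand-in for the cache folder: ten small temp files.
  final dir = await Directory.systemTemp.createTemp('warmup_demo');
  final paths = <String>[];
  for (var i = 0; i < 10; i++) {
    final file = File('${dir.path}/file_$i.bin');
    await file.writeAsBytes(List.filled(1024, i));
    paths.add(file.path);
  }

  final sw = Stopwatch()..start();
  await warmUp(paths, concurrency: 4);
  final total = readAllSync(paths);
  print('Took ${sw.elapsed} to read $total bytes from ${paths.length} files');

  await dir.delete(recursive: true);
}
```

The workers pull from a shared index, so the cap holds without any extra queueing. This still leaves the original problem that the analyzer would need to know the file names in advance.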

(@bwilkerson @scheglov FYI)

Labels

  • P3: A lower priority bug or feature request
  • analyzer-stability
  • area-dart-model: For issues related to conformance to the language spec in the parser, compilers or the CLI analyzer.
  • type-enhancement: A request for a change that isn't a bug
