Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ensure performance of tools on Dart 2 VM is on par with Dart 1 #33257

Closed
3 tasks
mraleph opened this issue May 28, 2018 · 34 comments
Closed
3 tasks

ensure performance of tools on Dart 2 VM is on par with Dart 1 #33257

mraleph opened this issue May 28, 2018 · 34 comments
Assignees
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. customer-dart-sass P2 A bug or feature request we're likely to work on
Milestone

Comments

@mraleph
Copy link
Member

mraleph commented May 28, 2018

Currently known regressions:

  • dart2js: around 2x regression in self compilation
  • build_runner: 50% regression when building build_runner's own _test, 2x regression on some individual steps
  • analyzer: indirectly confirmed via build_runner.

Most regressions seem to boil down to the cost of strong mode type checks: with parameter checks and implicit down-casts ignored hit on dart2js goes down to 10%.

@mraleph mraleph added the area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. label May 28, 2018
@mraleph mraleph self-assigned this May 28, 2018
@mraleph mraleph modified the milestones: Dart2Stable, Dart2Prerelease1 May 28, 2018
@mraleph
Copy link
Member Author

mraleph commented May 28, 2018

The initial plan is two adapt AOT specific optimizations developed as part of #31798 to JIT:

  • skip non-generic-covariant type checks on non-dynamically invoked methods;
  • port optimized type checking code that does not rely on caching (type testing stubs);

@mraleph mraleph added the P0 A serious issue requiring immediate resolution label May 28, 2018
dart-bot pushed a commit that referenced this issue Jun 1, 2018
For now we are limiting this to type checks against type parameter types.

# Performance improvements

In Dart 1 mode Dart2JS compiles itself in 28s when running from source
and in 23s when running from ideal app-jit snapshot (trained on the
same workload).

Before this change in Dart 2 mode numbers were 51s and 57s respectively.

After this change in Dart 2 mode numbers are 38s and 32s. Meaning
that regression is reduced by 50%.

Issue #31798
Issue #33257

Change-Id: I34bf5385a5cc3c7702dc281c6dfa89da85d3dde1
Reviewed-on: https://dart-review.googlesource.com/57601
Reviewed-by: Régis Crelier <regis@google.com>
Commit-Queue: Vyacheslav Egorov <vegorov@google.com>
dart-bot pushed a commit that referenced this issue Jun 1, 2018
… types."

This reverts commit 4be50d6.

Reason for revert: Failures on SIMDBC64 and Analyzer bots.

Original change's description:
> [vm] Enable type stubs based type checks in JIT mode for some types.
> 
> For now we are limiting this to type checks against type parameter types.
> 
> # Performance improvements
> 
> In Dart 1 mode Dart2JS compiles itself in 28s when running from source
> and in 23s when running from ideal app-jit snapshot (trained on the
> same workload).
> 
> Before this change in Dart 2 mode numbers were 51s and 57s respectively.
> 
> After this change in Dart 2 mode numbers are 38s and 32s. Meaning
> that regression is reduced by 50%.
> 
> Issue #31798
> Issue #33257
> 
> Change-Id: I34bf5385a5cc3c7702dc281c6dfa89da85d3dde1
> Reviewed-on: https://dart-review.googlesource.com/57601
> Reviewed-by: Régis Crelier <regis@google.com>
> Commit-Queue: Vyacheslav Egorov <vegorov@google.com>

TBR=vegorov@google.com,kustermann@google.com,regis@google.com

Change-Id: I85a30c962b0cd556310e19193f5993ab76ecf2e7
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://dart-review.googlesource.com/57840
Reviewed-by: Vyacheslav Egorov <vegorov@google.com>
Commit-Queue: Vyacheslav Egorov <vegorov@google.com>
dart-bot pushed a commit that referenced this issue Jun 1, 2018
Relanding 4be50d6 with fixes to DBC
and location summaries: AssertAssignable must save FPU registers.

For now we are limiting this to type checks against type parameter types.


In Dart 1 mode Dart2JS compiles itself in 28s when running from source
and in 23s when running from ideal app-jit snapshot (trained on the
same workload).

Before this change in Dart 2 mode numbers were 51s and 57s respectively.

After this change in Dart 2 mode numbers are 38s and 32s. Meaning
that regression is reduced by 50%.

Issue #31798
Issue #33257

Change-Id: Ifb55f86453bfdf36a2e03bcd7f3197cfde257103
Reviewed-on: https://dart-review.googlesource.com/57980
Commit-Queue: Vyacheslav Egorov <vegorov@google.com>
Reviewed-by: Régis Crelier <regis@google.com>
@dgrove
Copy link
Contributor

dgrove commented Jun 4, 2018

Any further updates here, @mraleph ?

@mraleph
Copy link
Member Author

mraleph commented Jun 4, 2018

The first fix has landed on Friday. The work on another one is starting.

@dgrove
Copy link
Contributor

dgrove commented Jun 7, 2018

Any updates here?

@mraleph
Copy link
Member Author

mraleph commented Jun 7, 2018

No updates at the moment. Still working on this.

@mraleph
Copy link
Member Author

mraleph commented Jun 11, 2018

@handicraftsman please elaborate on the “hmm?” - leaving such comments is not really helpful

@handicraftsman
Copy link

Any news?

@mraleph
Copy link
Member Author

mraleph commented Jun 11, 2018

Still working on this. Have another fix in the development that cuts another 50% of the regression.

@mraleph
Copy link
Member Author

mraleph commented Jun 11, 2018

@handicraftsman is there any particular reason you are interested in this? does your application regress in Dart 2 mode? in this case it would be helpful if you shared some information - so that we could ensure that we address regressions in your applications as well.

@nex3
Copy link
Member

nex3 commented Jun 14, 2018

Sass's speed regresses considerably in Dart 2 mode as well. Here's some data, collected using the same techniques we use to collect cross-implementation performance data, tested against Dart 2.0.0-dev.62.0. Note that due to #28617, we have live users running against both script and application snapshots.

Small Plain SCSS

Running on a file containing 4 instances of .foo {a: b}:

  • Dart 1 mode with a script snapshot: 0.363s
  • Dart 1 mode with an app snapshot: 0.117s
  • Dart 2 mode with a script snapshot: 0.326s
  • Dart 2 mode with an app snapshot: 0.172s

This is effectively just testing startup speed. It looks like Dart 2 actually improves startup speed with a script snapshot, but app snapshot startup speed is about 50% slower.

Large Plain CSS

Running on a file containing 2^17 instances of .foo {a: b}:

  • Dart 1 mode with a script snapshot: 2.479s
  • Dart 1 mode with an app snapshot: 2.066s
  • Dart 2 mode with a script snapshot: 2.893s
  • Dart 2 mode with an app snapshot: 2.647s

This benchmark simulates a large-scale style system. We see about a 15% performance hit for script snapshots, up to about 30% for app snapshots. This is pretty concerning: one of the selling points of Dart Sass is its performance parity with the C++ implementation, but this will break that parity and thereby substantially hurt our value proposition.

@mraleph
Copy link
Member Author

mraleph commented Jun 14, 2018

Thanks for trying this out @nex3 I will make sure to include Dart SASS in the set of tools I am benchmarking against.

@dgrove dgrove modified the milestones: Dart2Prerelease1, Dart2Stable Jun 19, 2018
@nex3
Copy link
Member

nex3 commented Jun 21, 2018

FYI, it looks like the performance of 2.0.0-dev.64.1 is slow enough that it's causing mass timeouts on Travis of tests that invoke Dart subprocesses.

@mraleph
Copy link
Member Author

mraleph commented Jun 22, 2018

@nex3 are you sure it is performance related issue and not a legit timeout (e.g. due to sync-async semantics changes or something like that)?

@nex3
Copy link
Member

nex3 commented Jun 22, 2018

Pretty sure; the tests pass on my fast desktop computer, but time out on Travis.

dart-bot pushed a commit that referenced this issue Jun 25, 2018
…ecking

This forwarders are used at dynamic call-sites and perform type checking
for all non-generic-covariant arguments. This allows to skip the same
type checks in the actual method body.

This yield on average 10% improvement in performance across the body of
benchmarks including dart2js compilation times.

Bug: #33257
Change-Id: If3fc94a2e0a6f496ec0633f0b379d053a54a40ca
Reviewed-on: https://dart-review.googlesource.com/61244
Commit-Queue: Vyacheslav Egorov <vegorov@google.com>
Reviewed-by: Alexander Markov <alexmarkov@google.com>
Reviewed-by: Martin Kustermann <kustermann@google.com>
nex3 added a commit to sass/dart-sass that referenced this issue Jun 25, 2018
This substantially speeds up the test time, and once we re-enable Dart
2 support is should work around dart-lang/sdk#33257.
nex3 added a commit to sass/dart-sass that referenced this issue Jun 27, 2018
This substantially speeds up the test time, and once we re-enable Dart
2 support is should work around dart-lang/sdk#33257.
@JekCharlsonYu JekCharlsonYu modified the milestones: Dart2Stable, Dart2.1 Jul 24, 2018
dart-bot pushed a commit that referenced this issue Jul 28, 2018
This CL introduces, similar to RawClass::direct_subclasses_, a
RawClass::direct_implementors field which is kept up-to-date.

The generation of type testing stubs uses this cached
reverse-relationship information for faster cid-range calculation (time
spent in type testing stub generation is decreased by an order of
magnitude).

This CL also enable type testing stubs for any instantiated type.

This seems to improve

  * analysis-server-cold-analysis by 6%
  * analysis-server-warm-analysis by 11%

as well as other non-analysis benchmarks in JIT mode.

Issue #33257

Change-Id: If01ee06ac08251b2ed27f79d6b82ad7354c8336e
Reviewed-on: https://dart-review.googlesource.com/66576
Commit-Queue: Martin Kustermann <kustermann@google.com>
Reviewed-by: Alexander Markov <alexmarkov@google.com>
Reviewed-by: Vyacheslav Egorov <vegorov@google.com>
@dgrove
Copy link
Contributor

dgrove commented Aug 24, 2018

@mraleph Any updates on this?

@mraleph
Copy link
Member Author

mraleph commented Aug 24, 2018

The progress can be tracked on #31798: we are landing changes, there is currently one CL in flight. With all changes combined I measure this on compiling large Dart application with dart2js:

  • Before recent changes 400-410 seconds
  • After all the changes 365-375 seconds
  • Dart 1 mode: 350 seconds.

I have not measured other tools yet.

@kevmoo kevmoo added P2 A bug or feature request we're likely to work on and removed P1 A high priority bug; for example, a single project is unusable or has many test failures labels Sep 6, 2018
@dgrove
Copy link
Contributor

dgrove commented Oct 9, 2018

@mraleph any updates here? Is anything else happening on this for 2.1?

@mraleph
Copy link
Member Author

mraleph commented Oct 9, 2018

We had to disable one of the key optimizations we did. @mkustermann is working on reenabling it. We are going to return to optimizations after it is re-enabled.

@mraleph mraleph modified the milestones: Dart2.1, Dart2.2 Oct 12, 2018
@mraleph
Copy link
Member Author

mraleph commented Oct 12, 2018

Move to 2.2 because the reland did not make it in time.

@nex3
Copy link
Member

nex3 commented Oct 12, 2018

FYI, between this and the dropped support for --no-preview-dart-2, Dart Sass is blocked on upgrading its Dart version. We're currently pinned to Dart 2.0.0, which is the last stable version that supports the performance we're looking for.

@mkustermann
Copy link
Member

After waiting for more than a week for our buildbot to stabilize (and we still have flaky crashes left), we decided to land the bugfix now (06f9a9e), which also re-enables the optimization.

@dgrove
Copy link
Contributor

dgrove commented Mar 13, 2019

Can this be closed, @mraleph and @mkustermann ?

1 similar comment
@dgrove
Copy link
Contributor

dgrove commented Mar 13, 2019

Can this be closed, @mraleph and @mkustermann ?

@mraleph
Copy link
Member Author

mraleph commented Mar 13, 2019

As far as I know we are still slower than Dart 1 sometimes, but we are currently not actively working on anything to reduce the overhead further.

Need to gather the numbers again.

@sstrickl could you take a stub at measuring where we stand currently with default mode versus --experimental_unsafe_mode_use_at_your_own_risk?

@mraleph mraleph assigned sstrickl and unassigned mraleph Mar 13, 2019
@mit-mit mit-mit modified the milestones: Dart 2.2, Dart 2.4 Apr 8, 2019
@mit-mit
Copy link
Member

mit-mit commented Apr 8, 2019

@sstrickl can we can an update on where we currently stand?

@sstrickl
Copy link
Contributor

sstrickl commented Apr 8, 2019

Just coming back to this after finishing up the RegExp work I was doing, so I'll try and have some initial numbers in here later this week. My initial benchmarking plans went awry when dart2js crashed while compiling the large code bases I was going to test it on, so I need to pick some alternative codebases for benchmarking. (That, and actually file an issue re: the crashes--I did let a member of the team know about them, but I should still create the issue as well for tracking.)

@sstrickl
Copy link
Contributor

sstrickl commented Apr 9, 2019

All the numbers below are using commit 0b40a46, since that's the last commit that could compile the codebases I've been using until an appropriate fix for issue #36516 is in place. However, that's only ~a month old (Mar 4), so the numbers shouldn't be too different on master. Below, if Checked is Yes, that means that dynamic checks were generated (so the experimental flag was not used).


Here's the numbers from running the first handful of N's Sass benchmarks 10 times each, min and max included with the mean to give you an idea of the result range:

Benchmark Snapshot Checked Minimum Mean Maximum
Small Plain Script Yes 0.18 0.196 0.21
" No 0.18 0.190 0.21
" App Yes 0.08 0.093 0.11
" No 0.08 0.091 0.11
Large Plain Script Yes 1.67 1.726 1.80
" No 1.64 1.691 1.76
" App Yes 1.67 1.737 1.79
" No 1.61 1.675 1.74
Preceding Sparse Script Yes 1.81 1.866 1.98
" No 1.74 1.783 1.83
" App Yes 1.80 1.849 1.93
" No 1.74 1.792 1.87
Following Sparse Script Yes 1.77 1.829 1.87
" No 1.69 1.746 1.85
" App Yes 1.78 1.827 1.92
" No 1.68 1.786 1.86
Preceding Dense Script Yes 2.66 2.717 2.79
" No 2.49 2.555 2.63
" App Yes 2.65 2.730 2.85
" No 2.50 2.597 2.71
Following Dense Script Yes 2.70 2.792 2.88
" No 2.60 2.664 2.81
" App Yes 2.74 2.851 2.90
" No 2.62 2.719 2.82

Here, the overhead of the strong type checks is ~5% on average, with the ranges on the checked vs. unchecked runs overlapping in almost all cases.


For the two large codebases I tried compiling with dart2js (first compiling dart2js to a kernel snapshot and using that for the benchmark times, each codebase compiled five times each):

Benchmark Checked Minimum Mean Maximum
Codebase 1 Yes 374.74 381.24 389.62
" No 369.80 375.96 380.14
Codebase 2 Yes 257.67 259.28 263.12
" No 252.28 256.96 260.94

Here, there's a very minor overhead (~1%) to the runtime strong type checks.


So while the dynamic checks do add a small bit of overhead, it seems much more reasonable than in prior testing.

@sstrickl
Copy link
Contributor

sstrickl commented Apr 9, 2019

Just to double-check, I've profiled the codebase 1 run both with and without checks, and made sure that the without checks run really was without checks. In the profile, the type testing stubs were around 1% of the profile overhead for the checked version, which matches approximately what I saw in the averages.

Thus, on the kinds of loads we were seeing here, the cost of dynamic type checks are no longer providing a significant percentage of overhead, and so I'm going to close this.

@sstrickl sstrickl closed this as completed Apr 9, 2019
@Stargator
Copy link
Contributor

@sstrickl How often are these kinds of performance tests done? Before each release or only when certain components are changed?

@mraleph
Copy link
Member Author

mraleph commented Apr 10, 2019

@Stargator we have a suite of performance tests which is automatically run for each commit to SDK repo in various configurations (native JIT/AOT, dart2js on V8 etc) on different hardware (different Intel variants, ARM32 and ARM64 hardware, mobile phones).

Due to technical complexities involved we don't track performance of some more real world scenarios there. However there is an expectation that performance on real world benchmarks is a function of performance on the benchmarks we are tracking.

In general I would recommend anybody using Dart to also have at least some performance tracking on their side - so that you can catch potential regressions between releases and report them back to us for fixing - if we miss something.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. customer-dart-sass P2 A bug or feature request we're likely to work on
Projects
None yet
Development

No branches or pull requests