Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance regression tests #388

Merged
merged 34 commits into from
Mar 15, 2023
Merged

Conversation

termi-official
Copy link
Member

@termi-official termi-official commented Oct 12, 2021

Introduce performance regression benchmarks through PkgBenchmark.jl on the most important pieces of Ferrite. Resolves #94 .

Benchmark Coverage

  • Common Lagrange assembly loops
  • Available hyperrectangular mesh generators
  • close!(dh) for the most common Lagrange problems (DofHandler + MixedDofHandler)
  • Application of Dirichlet BCs with both available strategies

What is missing

  • CI integration: Doing this should be addressed by a follow up PR. Currently a problem is that we cannot select subsets of the benchmark suite and running the complete suite really takes some time. Integration can be achieved via BenchmarkCI.jl
  • Micobenchmarks for utils: We have a plethora of utility functions which are not covered yet. We should at least cover non-lookup based utilities in future benchmarks.
  • Embedded elements in 2D and 3D due to missing capabilities

@codecov-commenter
Copy link

codecov-commenter commented Oct 12, 2021

Codecov Report

Patch coverage: 100.00% and no project coverage change.

Comparison is base (378f4ee) 92.97% compared to head (f7d9bdf) 92.97%.

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #388   +/-   ##
=======================================
  Coverage   92.97%   92.97%           
=======================================
  Files          28       28           
  Lines        4270     4270           
=======================================
  Hits         3970     3970           
  Misses        300      300           
Impacted Files Coverage Δ
src/Dofs/MixedDofHandler.jl 86.09% <100.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

Copy link
Member

@koehlerson koehlerson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, maybe add also an assembly where not the CellIterator is used, but rather

for cellid in 1:getncells(grid)
    # assemble code
    assemble!(assembler, celldofs(dh,cellid), Me, fe)
end

benchmark/benchmarks.jl Outdated Show resolved Hide resolved
benchmark/benchmarks.jl Outdated Show resolved Hide resolved
benchmark/benchmarks.jl Outdated Show resolved Hide resolved
benchmark/benchmarks.jl Outdated Show resolved Hide resolved
benchmark/benchmarks.jl Outdated Show resolved Hide resolved
benchmark/benchmarks.jl Outdated Show resolved Hide resolved
benchmark/benchmarks.jl Outdated Show resolved Hide resolved
@termi-official
Copy link
Member Author

@fredrikekre is this what you had in mind?

@fredrikekre
Copy link
Member

Nice. Is there an easy way to run this on two different versions or something?

@termi-official
Copy link
Member Author

It should be as simple as calling PkgBenchmark.judge with the commit you want to compare against. I don't know if this works properly across branches, but if I understand correctly this is done by https://github.com/tkf/BenchmarkCI.jl via BenchmarkCI.judge (which compares the performance of the current branch head against the master).

However be careful, the suite currently takes some time. 1h tuning and I have not run the benchmark further than tuning yet. Benchmarking close! takes quite some effort and I haven't found a way to simplify this one yet.

@KristofferC
Copy link
Collaborator

KristofferC commented Oct 13, 2021

1h tuning and I have not run the benchmark further than tuning yet.

Yikes, that seems a bit much. I think this is perhaps a little bit excessive to try all different combinations of dimensions etc. They all run pretty much the same code so it seems quite unlikely that one of them will randomly be super slow.

Very slow thorough benchmarks are worse than fast not-so thorough benchmarks because slow benchmarks will never be run.

@fredrikekre
Copy link
Member

Yea, would be good to reduce that I think, but perhaps nice to still have the option to run everything maybe.

@termi-official
Copy link
Member Author

I agree on most points and I think having the option to run just some benchmarks would certainly help, but this feature is not yet implemented in PkgBenchmark (see JuliaCI/PkgBenchmark.jl#132).

The rationale behind this was to provide infrastructure for possible new dispatches on close! and benchmarks with different dof handlers/grid types/.... I could reduce the loops to just include something like "2^dim element" grid in 3d with field-dim = spatial-dim (e.g. commonly incompressible elasticity). I would leave the current loops commented out such that we can reuse them once the feature is implemented in PkgBenchmark.jl

@termi-official
Copy link
Member Author

termi-official commented Oct 13, 2021

Workload is now 8 minutes for tuning and 5 minutes for the actual benchmark. Not sure about the output tho. The vector-Laplacian assembly is sometimes faster than the scalar Laplacian. Nvm.

@KristofferC
Copy link
Collaborator

Workload is now 8 minutes for tuning and 5 minutes for the actual benchmark

That feel pretty ok.

@termi-official
Copy link
Member Author

I think this is one ready now. Can't fix the issue to run subsets of the test with this PR, because this is an issue on the PkgBenchmark backend interface. Will suggest a fix in this package in the future.

@termi-official
Copy link
Member Author

I took a look into implementing running subsets of the benchmark but cannot find a good angle on this. Problem is consistent integration with the tuning procedure. Nonetheless, I think this implementation is helpful to compare different approaches and to catch performance regressions, especially since we currently refactor a bit and add quite a bit new features. I think I can just add two additional CI jobs to cover such benchmarks (for nightly and LTS) via BenchmarkCI.jl.

Project.toml Outdated Show resolved Hide resolved
@fredrikekre
Copy link
Member

Would be good with some docs on how to run, for example how to compare a PR with master. A make target in a benchmarks/Makefile that does this would be nice to have.

@termi-official
Copy link
Member Author

I added a makefile for the most common tasks and some docs. Also, I manually split up the benchmarks into small groups which can be executed individually for more fine-grained control (since PkgBenchmark.jl does not have such a feature).

docs/src/index.md Outdated Show resolved Hide resolved
@fredrikekre fredrikekre force-pushed the do/performance-regression-tests branch from a20c2fd to fc0fd4f Compare March 15, 2023 14:23
@fredrikekre fredrikekre merged commit 200b47a into master Mar 15, 2023
@fredrikekre fredrikekre deleted the do/performance-regression-tests branch March 15, 2023 16:32
@termi-official termi-official mentioned this pull request Apr 12, 2023
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Benchmarking
5 participants