Improving performance of DGMulti flux differencing #757
Conversation
First step: optimize flux differencing using dense SBP matrices. Old timing:
After optimization of
Codecov Report

```diff
@@            Coverage Diff             @@
##             main     #757      +/-   ##
=========================================
  Coverage   93.58%   93.59%
=========================================
  Files         182      182
  Lines       17644    17710      +66
=========================================
+ Hits        16512    16574      +62
- Misses       1132     1136       +4
```
This was originally introduced as an optimized version of the routine, as suggested in https://github.com/trixi-framework/Trixi.jl/pull/695/files#r670403097, but it doesn't seem to affect performance.
Profiling results: shaved off 15% runtime by removing a bad broadcast. Other main causes of slowdown appear to be
Depending on your implementation of flux differencing, JuliaArrays/StaticArrays.jl#949 can also speed up your code.
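For reference, the kind of flux-differencing kernel being discussed can be sketched as follows. This is a hedged, simplified illustration, not Trixi.jl's actual implementation: the function name `flux_differencing!` and the loop structure are hypothetical, but the pattern (accumulating `2 * Q[i, j] * f(u_i, u_j)` over a dense SBP operator `Q`, exploiting symmetry of the two-point flux) matches the standard formulation, and storing states as `SVector`s keeps the inner flux evaluation allocation-free.

```julia
using StaticArrays

# Hypothetical sketch of a dense-SBP flux-differencing volume kernel.
# `Q` is a dense SBP-like operator; `volume_flux` is a symmetric
# two-point flux, so each pair (i, j) is visited only once.
function flux_differencing!(du, u, Q, volume_flux)
    n = length(u)
    for i in 1:n, j in (i + 1):n   # symmetry: f(u_i, u_j) == f(u_j, u_i)
        f_ij = volume_flux(u[i], u[j])
        du[i] += 2 * Q[i, j] * f_ij
        du[j] += 2 * Q[j, i] * f_ij
    end
    return du
end

# Usage with a central (average) two-point flux on two nodes:
u  = [SVector(1.0, 0.5), SVector(2.0, 0.25)]
du = [zero(SVector{2, Float64}) for _ in 1:2]
Q  = [0.0 0.5; -0.5 0.0]           # skew-symmetric illustrative operator
flux_differencing!(du, u, Q, (a, b) -> 0.5 * (a + b))
```

Because each state is an `SVector`, the two-point flux and the accumulation compile down to stack-only arithmetic, which is where the StaticArrays issue linked above becomes relevant.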
After more optimization, the time for KHI has dropped to

```
calc_volume_integral!    966    1.26s    79.9%    1.31ms
```

However, the timing for KHI with

```
volume integral          986    206ms    54.5%    209μs
```

I think I can get a bit more speedup by not using LazyArrays (see JuliaArrays/LazyArrays.jl#189), but it's definitely not the only thing needed to close the performance gap.
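The LazyArrays change mentioned above amounts to replacing a lazily fused matrix product with a plain in-place multiplication into a preallocated buffer. A minimal sketch of that pattern (sizes and variable names are made up for illustration; this is not the PR's actual code):

```julia
using LinearAlgebra: mul!

A = rand(16, 16)       # dense operator (e.g. an SBP matrix)
x = rand(16, 4)        # solution data for one element
cache = similar(x)     # preallocated work buffer, reused every call

# cache .= A * x, computed in place with no lazy wrapper and no temporaries
mul!(cache, A, x)
```

For small dense operators applied many times per time step, the explicit `mul!` avoids the per-call overhead of constructing and traversing a lazy expression.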
Aha! Tried using

```
volume integral          986    1.47s    75.3%    1.49ms
```

I see an odd performance Heisenbug related to whether or not I'm using PtrArray in
Which KHI setup are you using? Is it on a uniform grid or do you use some nonconforming interfaces?
Just a uniform
Are you running Julia with
I don't believe so - are you wondering about the performance Heisenbug?
No, just about the general performance difference. When we benchmark Trixi.jl, we usually run Julia with
…c/optimize_fluxdiff
Co-authored-by: Hendrik Ranocha <ranocha@users.noreply.github.com>
…c/optimize_fluxdiff
intended to be compared with examples/dgmulti_2d/elixir_euler_kelvin_helmholtz_instability.jl
Looks mostly good. I just have a few minor comments and questions 👍
examples/tree_2d_dgsem/elixir_euler_kelvin_helmholtz_instability_no_shock_capturing.jl
Co-authored-by: Hendrik Ranocha <ranocha@users.noreply.github.com>
Co-authored-by: Hendrik Ranocha <ranocha@users.noreply.github.com>
…c/optimize_fluxdiff
- using `Matrix{SVector{nvars, uEltype}}`, which seems to be a little bit faster on average.
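To illustrate what this storage layout buys (a generic sketch with made-up sizes, not the PR's code): with a `Matrix{SVector{nvars, uEltype}}`, all conserved variables of a node live in one stack-friendly value, so per-node arithmetic allocates nothing.

```julia
using StaticArrays

nvars = 4                                       # e.g. 2D compressible Euler
# nodes × elements matrix of per-node state vectors (illustrative sizes)
u = [zero(SVector{nvars, Float64}) for _ in 1:16, _ in 1:10]

# One node's full state is a single SVector ...
u[1, 1] = SVector(1.0, 0.1, -0.2, 2.5)
# ... and arithmetic on node states is allocation-free:
flux = 0.5 * (u[1, 1] + u[2, 1])
```

Compared with a plain `Array{uEltype, 3}` indexed by variable, this keeps a node's variables contiguous and lets two-point fluxes work directly on `SVector`s.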
Oops - tests are failing because I still need to specialize the analysis routines for the new solution storage.
- also introduce new types for specialization of `mul_by!(A::UniformScaling)` and `mul_by_accum!(A::UniformScaling)`
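One way such a specialization can look (a hedged sketch: the names `mul_by!`/`mul_by_accum!` come from the comment above, but these bodies are illustrative, not the PR's implementation): dispatching on `UniformScaling` provides a fast path that never materializes a dense identity matrix.

```julia
using LinearAlgebra: UniformScaling, mul!

# Generic case: apply a dense operator in place.
mul_by!(A) = (out, x) -> mul!(out, A, x)
# Fast path: a UniformScaling (e.g. I) reduces to a scalar broadcast.
mul_by!(A::UniformScaling) = (out, x) -> out .= A.λ .* x

# Accumulating variants: out += A * x (5-arg mul! computes A*x*α + out*β).
mul_by_accum!(A) = (out, x) -> mul!(out, A, x, true, true)
mul_by_accum!(A::UniformScaling) = (out, x) -> out .+= A.λ .* x
```

Dispatching on the operator type lets the same kernel code call `mul_by!(A)` everywhere while the identity-like cases compile to trivial broadcasts.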
Thanks a lot - nice work 👍
Thanks for reviewing!
This PR will improve performance of flux differencing for DGMulti solvers.