Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attempt cache thrashing fix for low synapse count vector access #3113

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

otcathatsya
Copy link
Contributor

@otcathatsya otcathatsya commented Feb 24, 2024

  • This is an attempt to fix an observed niche issue where, on a C++ level, connection creation will be slowed down by the num_connections_ vector accessing inner indices on the same cache line and therefore invalidating memory across threads, usually when there is only a few synapse models (e.g. when running with -Dwith-modelset=iaf_minimal)
  • Occurs mostly commonly with jemalloc:

Allocations are packed tightly together, which can be an issue for multi-threaded applications. If you need to assure that allocations do not suffer from cacheline sharing, round your allocation requests up to the nearest multiple of the cacheline size, or specify cacheline alignment when allocating.

  • Very much experimental, this fixes the time_construction_connect difference in SLI tests but makes no difference on a Python level (neither did the problem occur when timing from Python to begin with. Too insignificant?)
  • This does increase memory consumption up to a certain point (the L1 cache line usually is 64 bytes), but stays below full modelset levels; it is actually lower when running the full modelset with this alignment.
  • Maybe the destructive interference constexprs can be useful for something else further down the line
  • If this does get considered it should probably be amended to include -Wno-interference-size to suppress the compile warnings about this value varying between platforms, or be an optional CMake flag altogether
  • To-do: find a way to apply the destructive interference offset to the outer vector for more memory savings

@otcathatsya otcathatsya added T: Discussion Still searching for the right way to proceed / suggestions welcome S: Normal Handle this with default priority I: No breaking change Previously written code will work as before, no one should note anything changing (aside the fix) labels Feb 24, 2024
Copy link

Pull request automatically marked stale!

@github-actions github-actions bot added the stale Automatic marker for inactivity, please have another look here label Apr 26, 2024
@terhorstd
Copy link
Contributor

@suku248 and @heplesser want to be in the loop for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I: No breaking change Previously written code will work as before, no one should note anything changing (aside the fix) S: Normal Handle this with default priority stale Automatic marker for inactivity, please have another look here T: Discussion Still searching for the right way to proceed / suggestions welcome
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants