Fix two spinlocks #44517

Merged
merged 1 commit into from
Mar 26, 2024

Conversation

wddgit
Contributor

@wddgit wddgit commented Mar 22, 2024

PR description:

Fix bugs in two spin locks, one in LogErrorEventFilter and one in Path. Both bugs could cause rare data races and crashes.

The one in LogErrorEventFilter recently caused rare crashes in the IBs, which is how we noticed the problem. This is a minimal fix. We will discuss this further and may make related changes in the future. See issue #44413 for discussion.

The problem in Path has been there for 4 years and no one has noticed it or connected any problems to it. The problem in LogErrorEventFilter has been there for 5 years and its effects were only recently noticed. I'm not sure whether this is worth backporting. The problem should occur only very rarely...

In part, the fix is a test. The problems are not reproducible, so we are not 100% sure this fixes them. We plan to observe the IBs after this is merged and see whether the recently noticed problems stop occurring.
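For readers unfamiliar with the pattern, here is a minimal sketch of a spin lock built on `std::atomic<bool>`. It is not the code from this PR, and it does not reproduce the actual bugs in LogErrorEventFilter or Path, which are not shown in this conversation. The names `SpinLockGuard` and `countWithSpinLock` are hypothetical and exist only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical RAII guard (not CMSSW code): spins until it owns the flag,
// releases the flag on destruction.
class SpinLockGuard {
public:
  explicit SpinLockGuard(std::atomic<bool>& flag) : flag_(flag) {
    bool expected = false;
    // On failure compare_exchange_weak overwrites 'expected' with the observed
    // value (true), so it must be reset before every retry.
    while (!flag_.compare_exchange_weak(
        expected, true, std::memory_order_acquire, std::memory_order_relaxed)) {
      expected = false;
    }
  }
  // The release store makes writes done under the lock visible to the next owner.
  ~SpinLockGuard() { flag_.store(false, std::memory_order_release); }

  SpinLockGuard(const SpinLockGuard&) = delete;
  SpinLockGuard& operator=(const SpinLockGuard&) = delete;

private:
  std::atomic<bool>& flag_;
};

// Hypothetical workload: several threads increment a plain (non-atomic)
// counter, each increment protected by the spin lock.
long countWithSpinLock(int nThreads, int nIncrements) {
  assert(nThreads > 0 && nIncrements >= 0);
  std::atomic<bool> lock{false};
  long counter = 0;
  std::vector<std::thread> threads;
  for (int t = 0; t < nThreads; ++t) {
    threads.emplace_back([&lock, &counter, nIncrements] {
      for (int i = 0; i < nIncrements; ++i) {
        SpinLockGuard guard(lock);
        ++counter;
      }
    });
  }
  for (auto& th : threads) {
    th.join();
  }
  return counter;
}
```

With a correct lock, `countWithSpinLock(4, 10000)` returns 40000 on every run. A lock that sometimes lets two threads into the critical section would occasionally lose increments, which is the kind of rare, hard-to-reproduce failure described above.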

PR validation:

Relies on existing tests.

@cmsbuild
Contributor

cmsbuild commented Mar 22, 2024

cms-bot internal usage

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-44517/39640

  • This PR adds an extra 28KB to repository

@wddgit
Contributor Author

wddgit commented Mar 22, 2024

enable threading

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @wddgit for master.

It involves the following packages:

  • DPGAnalysis/Skims (pdmv)
  • FWCore/Framework (core)

@Dr15Jones, @cmsbuild, @sunilUIET, @AdrianoDee, @smuzaffar, @miquork, @makortel can you please review it and eventually sign? Thanks.
@youyingli, @AnnikaStein, @makortel, @missirol this is something you requested to watch as well.
@rappoccio, @sextonkennedy, @antoniovilela you are the release manager for this.

cms-bot commands are listed here

@wddgit
Contributor Author

wddgit commented Mar 22, 2024

please test with #44447

@wddgit
Contributor Author

wddgit commented Mar 22, 2024

type bug

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-3ea906/38353/summary.html
COMMIT: f25ce3b
CMSSW: CMSSW_14_1_X_2024-03-22-1100/el8_amd64_gcc12
Additional Tests: THREADING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/44517/38353/install.sh to create a dev area with all the needed externals and cmssw changes.

@makortel
Contributor

+core

@makortel
Contributor

> The problem in Path has been there for 4 years and no one has noticed it or connected any problems to it. The problem in LogErrorEventFilter has been there for 5 years and its effects were only recently noticed. I'm not sure whether this is worth backporting. The problem should occur only very rarely...

My feeling is that #43522 somehow caused globalEndLumi transitions to actually run concurrently. This PR (and #44447) could be worth backporting nevertheless; doing the backports could be cheaper than trying to fully understand why this situation didn't happen before. The code being fixed was wrong in any case.

@antoniovilela
Contributor

+1

@AdrianoDee
Contributor

+pdmv

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will be automatically merged.

@cmsbuild cmsbuild merged commit db4e866 into cms-sw:master Mar 26, 2024
12 checks passed
@wddgit wddgit deleted the fixSpinLocks branch October 28, 2024 15:29