
Fix race condition in json.Register #7

Merged: 1 commit into dev from snowden/race-condition-fix on Nov 16, 2023

Conversation

MichaelSnowden

What was changed

I changed the json.Register function so that it invokes registrar.Register only after all handlers have been added to the map.

Why?

A handler can be invoked as soon as it has been registered with the registrar, and nothing prevents that from happening before we've finished writing to the handlers map. Since the handler function reads from that same map, this can lead to a data race.
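
Roughly, the fix gives json.Register a two-phase shape: finish every write to the handlers map first, then hand the handler to the registrar. The sketch below illustrates only that ordering; the type names (`handlerFunc`, `Registrar`, `handler`) and signatures are stand-ins, not the actual tchannel-go API.

```go
package json

import "context"

// handlerFunc and Registrar are simplified stand-ins for the real tchannel-go types.
type handlerFunc func(ctx context.Context, req []byte) ([]byte, error)

type Registrar interface {
	// Register makes h reachable for inbound calls to method.
	Register(h *handler, method string)
}

type handler struct {
	handlers map[string]handlerFunc // read on every inbound call
}

func (h *handler) Handle(ctx context.Context, method string, req []byte) ([]byte, error) {
	return h.handlers[method](ctx, req) // this map read raced with Register's writes
}

// Register populates the handlers map completely before exposing the
// handler via the registrar. Previously each method became reachable as
// soon as it was added, so an inbound call could read the map while later
// iterations were still writing to it.
func Register(registrar Registrar, funcs map[string]handlerFunc) {
	h := &handler{handlers: make(map[string]handlerFunc)}

	// Phase 1: finish all writes to the shared map.
	for method, f := range funcs {
		h.handlers[method] = f
	}

	// Phase 2: the map is now effectively read-only, so inbound calls that
	// arrive after registration cannot race with the writes above.
	for method := range funcs {
		registrar.Register(h, method)
	}
}
```

The important property is the ordering: once a method is registered, a call for it may run concurrently with the rest of Register, so all map writes have to be completed first.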

Checklist

  1. Closes temporalio/temporal#5123 ([Bug] Race: Concurrent map read/write, probably in server)

  2. How was this tested:

You can see a justification for this here: temporalio/temporal#5123 (comment).
It's hard to reproduce this in temporal itself because it's a very rare bug, so I'm not going to try. The change itself is small, so it should be easy to tell whether the race goes away. A minimal illustration of the race pattern involved is sketched after this checklist.

  3. Any docs updates needed?

No.
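
As context for the testing note above: bugs of this shape are hard to hit deterministically, but Go's race detector flags them whenever the unsynchronized accesses actually execute. The following self-contained example (not tchannel-go code, just the same concurrent map read/write pattern) reports a data race under `go run -race`:

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	// Shared map of method name -> handler, as in the registration scenario.
	handlers := map[string]func() string{}

	var wg sync.WaitGroup
	wg.Add(1)

	// Simulates an inbound call for a method that has already been
	// registered, reading the map while registration is still in progress.
	go func() {
		defer wg.Done()
		if h, ok := handlers["first"]; ok { // unsynchronized map read
			fmt.Println(h())
		}
	}()

	// Simulates registration continuing to populate the same map.
	handlers["first"] = func() string { return "first" }   // unsynchronized map write
	handlers["second"] = func() string { return "second" } // unsynchronized map write

	wg.Wait()
}
```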


@dandavison left a comment:


Nice! LGTM.

@MichaelSnowden merged commit bd4fb76 into dev on Nov 16, 2023.
@MichaelSnowden deleted the snowden/race-condition-fix branch on November 16, 2023 at 02:23.
MichaelSnowden added a commit to temporalio/temporal that referenced this pull request on Nov 16, 2023:
**What changed?**
This PR updates our version of tchannel to one that contains this
change: temporalio/tchannel-go#7.

**Why?**
This should fix a rare race condition that occurs when a tchannel is
used while handlers are still being added to it.

**How did you test it?**
I didn't actually reproduce it, but it's a very small change, and the
race condition is easy to see from the code.

**Potential risks**
The only other changes pulled in from this upgrade were changes to the
go.mod in tchannel-go, which look pretty innocuous:

```
-    golang.org/x/net v0.0.0-20211015210444-4f30a5c0130f
-    golang.org/x/sys v0.0.0-20211019181941-9d821ace8654
+    golang.org/x/net v0.7.0
+    golang.org/x/sys v0.5.0
```

**Is hotfix candidate?**
No, it can wait. This only comes up super rarely.