Fix possible race condition on request ctx done #1662

peczenyj · 2023-11-16T10:55:05Z

Hello

A few days ago I found this tweet about fasthttp and go-fiber:

https://twitter.com/davidnix_/status/1720454052973044188?s=48

I tried to contact davidnix_ for more information, since I use fasthttp in my projects and a race condition is something I'd like to avoid in a production environment.

He did not provide a runnable example (I think he deleted or lost the original code), but I asked him to open an issue anyway, since I tried to reproduce the problem without success.

Based on the screenshots, I think I found the root cause: we assign nil to the done channel, and if another goroutine is using it at the same time, for example through the Done() method, this may trigger the race condition. However, I couldn't write a unit test that fails, with or without the race detector.

If @DavidNix himself wants to contribute, I think it would be helpful.

Regards

DavidNix · 2023-11-16T13:12:26Z

The data race isn't with closing the channel. IIRC, it's with a struct field that holds a reference to a custom context.Context. I'll try to reproduce it in a few days.

IMO storing a context in a struct field is a design smell and discouraged.

From https://pkg.go.dev/context, they explicitly warn against it:

Do not store Contexts inside a struct type; instead, pass a Context explicitly to each function that needs it.

peczenyj · 2023-11-16T13:24:36Z

The data race isn't with closing the channel.

No, but after closing the channel it was set to nil, I think to prevent closing it again.

This is one possible source of race conditions.

If you look at the screenshot, there is a write at server.go line 1916 and a read at server.go line 2728.

By adding a once function, we can eliminate that write, because the field is never set to nil.

However, this is not the end of the story; the problem goes deeper and is linked to the design itself. Perhaps adapting *fasthttp.RequestCtx to be a context.Context had some unexpected side effects.
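A minimal sketch of the idea described above, assuming a simplified stand-in for the server: the done channel is closed at most once and the field is never reassigned, so readers of Done() never race with a write. The `server` type, `newServer`, and `shutdown` are hypothetical names for illustration, not the actual fasthttp code.

```go
package main

import (
	"fmt"
	"sync"
)

// server is a hypothetical stand-in for the relevant part of fasthttp.Server.
type server struct {
	done      chan struct{}
	closeDone sync.Once
}

func newServer() *server {
	return &server{done: make(chan struct{})}
}

// Done returns the channel. The field is set once in newServer and never
// written again, so concurrent readers do not race with shutdown.
func (s *server) Done() <-chan struct{} { return s.done }

// shutdown closes done at most once and never sets it to nil.
func (s *server) shutdown() {
	s.closeDone.Do(func() { close(s.done) })
}

func main() {
	s := newServer()

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-s.Done() // blocks until shutdown closes the channel
		}()
	}

	s.shutdown()
	s.shutdown() // second call is a no-op instead of a double-close panic
	wg.Wait()
	fmt.Println("all waiters released")
}
```

Running this kind of sketch with `go run -race` is one way to check that the read in Done() and the close in shutdown are not reported as a data race.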

peczenyj · 2023-11-16T17:53:33Z

Maybe this Fix #1663

erikdubbelboer · 2023-11-27T09:12:47Z

server.go

-		close(s.done)
+		s.closeDone.Do(func() {
+			close(s.done)
+		})


You have to reset s.closeDone when s.done is reset. This change now makes it impossible to reuse the Server struct after closing it.

peczenyj · 2023-11-27

Oh I see, and since you want to reuse the Server struct, you may create s.done again.

I must think about this, because if I reset s.closeDone we may end up with the same code and the same behaviour.

erikdubbelboer · 2023-12-13

I don't mean exactly here. I think you have to reset it when Serve is called. That way you no longer have the same behaviour, except when you reuse the Server struct, and then it's intended.
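One way to read the suggestion above, sketched under assumptions: each call that starts serving installs a fresh done channel together with a fresh sync.Once, and shutdown closes whichever pair is current. The `server`, `serve`, `shutdown`, and `isClosed` names are hypothetical, and the sketch assumes only one serve is active at a time, which the real Server may not guarantee.

```go
package main

import (
	"fmt"
	"sync"
)

// server is a hypothetical stand-in, not the fasthttp.Server type.
type server struct {
	mu        sync.Mutex
	done      chan struct{}
	closeDone *sync.Once
}

// serve installs a fresh channel and a fresh Once, so a server that was
// shut down can be reused. Assumes a single active serve at a time.
func (s *server) serve() <-chan struct{} {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.done = make(chan struct{})
	s.closeDone = new(sync.Once)
	return s.done
}

// shutdown closes the current channel at most once and leaves the field set.
func (s *server) shutdown() {
	s.mu.Lock()
	done, once := s.done, s.closeDone
	s.mu.Unlock()
	if once == nil {
		return // serve was never called
	}
	once.Do(func() { close(done) })
}

// isClosed reports whether ch has been closed, without blocking.
func isClosed(ch <-chan struct{}) bool {
	select {
	case <-ch:
		return true
	default:
		return false
	}
}

func main() {
	s := &server{}
	first := s.serve()
	s.shutdown()
	s.shutdown() // no panic: the Once guards the close

	second := s.serve() // reuse after shutdown
	fmt.Println("first closed:", isClosed(first))
	fmt.Println("second closed:", isClosed(second))
}
```

Pairing the Once with the channel it guards keeps the "close exactly once" guarantee per serve cycle while still allowing reuse.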

byte0o · 2024-06-22

@erikdubbelboer @peczenyj I think it can be modified like this:

func (ctx *RequestCtx) Done() <-chan struct{} {
	// fix: Use locks to prevent concurrent modifications,
	// and use new variables to prevent panic caused by modifying the original done chan to nil.
	ctx.s.mu.Lock()
	defer ctx.s.mu.Unlock()
	if ctx.s.done == nil {
		tmp := make(chan struct{}, 1)
		tmp <- struct{}{}
		return tmp
	}
	doneChan := ctx.s.done
	return doneChan
}

func (ctx *RequestCtx) Err() error {
	select {
	case <-ctx.Done():
		// fix: Use unified functions instead of reference variables to converge fetching into one place
		return context.Canceled
	default:
		return nil
	}
}
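A small, self-contained sketch of one detail in the proposal above: a buffered channel holding a single value satisfies only one receive, whereas a closed channel satisfies every receive. In the proposal each Done() call returns a fresh channel, so this only matters if the same returned channel is received from more than once. The helper names `buffered`, `closed`, and `ready` are hypothetical.

```go
package main

import "fmt"

// buffered mimics the fallback in the proposal: a 1-slot channel holding one value.
func buffered() <-chan struct{} {
	ch := make(chan struct{}, 1)
	ch <- struct{}{}
	return ch
}

// closed returns a channel that is already closed.
func closed() <-chan struct{} {
	ch := make(chan struct{})
	close(ch)
	return ch
}

// ready reports whether a receive on ch succeeds without blocking.
// On a buffered channel this consumes the pending value.
func ready(ch <-chan struct{}) bool {
	select {
	case <-ch:
		return true
	default:
		return false
	}
}

func main() {
	b := buffered()
	fmt.Println(ready(b), ready(b)) // true false: the single value is consumed

	c := closed()
	fmt.Println(ready(c), ready(c)) // true true: a closed channel is always ready
}
```

If the intent is "already done", returning a closed channel is the more conventional signal, since it behaves the same way for any number of receivers.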

MaxBreida · 2023-12-11T14:09:43Z

Hey, we also sometimes get a panic in our services with the following stack trace. Could this be related to this issue?

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x560 pc=0xa5d08d]

goroutine 5095 [running]:
github.com/valyala/fasthttp.(*RequestCtx).Done(...)
    /home/vsts/go/pkg/mod/github.com/valyala/fasthttp@v1.50.0/server.go:2728
context.(*cancelCtx).propagateCancel.func2()
    /opt/hostedtoolcache/go/1.21.4/x64/src/context/context.go:506 +0x44
created by context.(*cancelCtx).propagateCancel in goroutine 3836
    /opt/hostedtoolcache/go/1.21.4/x64/src/context/context.go:504 +0x395

nickajacks1 · 2024-01-02T18:21:19Z

I understand that the channel may be set to nil to prevent a panic, but is it really valid to call Shutdown twice in the first place? Couldn't it instead be stated that users shall not call Shutdown more than once? Then the race could be avoided.

tylitianrui · 2024-01-10T15:17:46Z

server.go

-	done chan struct{}
+
+	done      chan struct{}
+	closeDone sync.Once


I think atomic is better than sync.Once

peczenyj · 2024-06-02

How should I use atomic in this context? For example, to store the state (open / closed)?
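As a possible answer to the question above, here is a sketch of storing the open/closed state in an atomic flag and closing the channel only on the transition from open to closed. It assumes Go 1.19 or later for `atomic.Bool`; the `server` type and `shutdown` method are hypothetical and not the code from this PR.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// server is a hypothetical stand-in for the relevant part of fasthttp.Server.
type server struct {
	done   chan struct{}
	closed atomic.Bool // false = open, true = closed (requires Go 1.19+)
}

// shutdown closes done only for the caller that wins the false -> true swap.
// It reports whether this call performed the close.
func (s *server) shutdown() bool {
	if s.closed.CompareAndSwap(false, true) {
		close(s.done)
		return true
	}
	return false
}

func main() {
	s := &server{done: make(chan struct{})}
	fmt.Println(s.shutdown()) // first call closes the channel
	fmt.Println(s.shutdown()) // later calls are no-ops
}
```

Compared with sync.Once, the flag can be flipped back with `s.closed.Store(false)` when a new done channel is installed, though that reset would itself need to be coordinated with any concurrent readers.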

peczenyj added 3 commits November 16, 2023 11:44

add OnceFunc for go 1.20

04dd294

make sure that we will close the done channel just once and we will not assign it to nil in order to avoid a race condition

9243a92

fix build tags

7ed18bf

peczenyj changed the title ~~Fix race condition on request ctx done~~ Fix possible race condition on request ctx done Nov 16, 2023

erikdubbelboer requested changes Nov 24, 2023

peczenyj added 2 commits November 26, 2023 14:43

simplify code using sync.Once

df2c2d2

remove unused code

1c39ad1

peczenyj requested a review from erikdubbelboer November 26, 2023 13:46

erikdubbelboer requested changes Nov 27, 2023

tylitianrui reviewed Jan 10, 2024

Merge branch 'master' into fix-race-condition-on-request-ctx-done

219c3e6

byte0o added a commit to byte0o/fasthttp that referenced this pull request Jul 16, 2024

Fix possible race condition on request ctx done valyala#1662

d66aec7

byte0o mentioned this pull request Jul 16, 2024

Fix possible race condition on request ctx done #1662 #1806

Merged

byte0o added a commit to byte0o/fasthttp that referenced this pull request Jul 16, 2024

Fix possible race condition on request ctx done valyala#1662

6c1d69d

erikdubbelboer closed this Jul 23, 2024

byte0o mentioned this pull request Oct 9, 2024

Context is canceled on a newly initialized fasthttp.RequestCtx in 1.56.0 #1879

Closed
