Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid infinite loop in CreateOffer #1657

Closed
wants to merge 1 commit into from

Conversation

jech
Copy link
Member

@jech jech commented Feb 2, 2021

Related to #1656.

@codecov
Copy link

codecov bot commented Feb 2, 2021

Codecov Report

Merging #1657 (723c598) into master (cddf631) will decrease coverage by 0.14%.
The diff coverage is 50.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1657      +/-   ##
==========================================
- Coverage   75.18%   75.04%   -0.15%     
==========================================
  Files          79       79              
  Lines        5694     5698       +4     
==========================================
- Hits         4281     4276       -5     
- Misses       1035     1041       +6     
- Partials      378      381       +3     
Flag Coverage Δ
go 76.62% <50.00%> (-0.16%) ⬇️
wasm 69.93% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
peerconnection.go 74.16% <50.00%> (-0.09%) ⬇️
operations.go 91.66% <0.00%> (-2.78%) ⬇️
dtlstransport.go 67.00% <0.00%> (-2.00%) ⬇️
icetransport.go 68.58% <0.00%> (-1.29%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cddf631...9a44cf9. Read the comment docs.

If the local description keeps getting changed, or in case of a but
in Pion, CreateOffer never terminates, which could cause client software
to hang.  Set an arbitrary bound on the number of iterations.
@jech jech force-pushed the avoid-infinite-loop-in-create-offer branch from a41a54e to 9a44cf9 Compare February 2, 2021 21:37
@Sean-Der
Copy link
Member

Sean-Der commented Feb 2, 2021

🔥 LGTM!

I will cherry-pick and include a test. Would also like to figure out the root cause

@jech
Copy link
Member Author

jech commented Feb 2, 2021

Would also like to figure out the root cause

Absolutely. With this patch, Galène no longer hangs, but it still fails streams when AddSender races with SetRemoteDescription.

@@ -576,6 +577,8 @@ func (pc *PeerConnection) hasLocalDescriptionChanged(desc *SessionDescription) b
return false
}

var errExcessiveRetries = errors.New("excessive retries in CreateOffer")
Copy link
Contributor

@OrlandoCo OrlandoCo Feb 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we give more context here, this message may not be helpful, maybe we can say that transceivers were modified during negotiation, or something that can help users to know that they are on a racing issue

Copy link
Member Author

@jech jech Feb 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not happen in normal usage — if transceivers are being modified, then we'll only loop once or twice, while the error only appears after 128 iterations. If this message appears, it is indicative of a bug, either in the client application or in Pion.

In other words, this allows the application to recover after it hits a bug, rather than hanging and needing to be restarted.

So perhaps we could add something like "this is probably a bug somewhere", but I feel it's overkill.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have nerver seen this error in ion-sfu, but once I was trying to reuse transceiver I see it happen once, so I thought it was a racing issue related to mids that could not be found.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The race I describe only happens when the client is slow sending its answer, so it's the kind of bug that you typically only see in production.

@Sean-Der
Copy link
Member

Sean-Der commented Feb 3, 2021

Merged with 1640767 thanks @jech

I also added a test

@Sean-Der Sean-Der closed this Feb 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants