Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure extension RPC endpoints ready before processing gRPC messages #10255

Merged
merged 3 commits into from
Jul 11, 2024

Conversation

jviau
Copy link
Contributor

@jviau jviau commented Jun 27, 2024

Issue describing the changes in this PR

resolves #10251

Pull request checklist

IMPORTANT: Currently, changes must be backported to the in-proc branch to be included in Core Tools and non-Flex deployments.

  • Backporting to the in-proc branch is not required
    • Otherwise: Link to backporting PR -- TODO
  • My changes do not require documentation changes
    • Otherwise: Documentation issue linked to PR
  • My changes should not be added to the release notes for the next release
    • Otherwise: I've added my notes to release_notes.md
  • My changes do not need to be backported to a previous version
    • Otherwise: Backport tracked by issue/PR #issue_or_pr
  • My changes do not require diagnostic events changes
    • Otherwise: I have added/updated all related diagnostic events and their documentation (Documentation issue linked to PR)
  • I have added all required tests (Unit tests, E2E tests)

Additional information

It is believed there is a race condition on startup between the first gRPC message coming in and extension RPC endpoints being registered. While EndpointDataSource has a change token system, leading us to believe that we can update endpoints post startup, there is one of two possibilities occurring:

  1. Only a subset of messages fail due to a time gap between host/worker startup and the extension RPC endpoints being registered.
  2. OR AspNetCore locks-in the available endpoints when UseRouting middleware is first encountered. In which case this host instance would not have extension RPC endpoints for its lifetime.

This fix should address both of those possible scenarios, as routing middleware is only initialized on first call. By ensuring we collect extension endpoints before the first routing occurs we avoid the race condition.

RISK: there is a small risk this could introduce a deadlock. If there is some dependency existing, or later introduced, which requires RPC communication between host and worker before all extensions are loaded host-side, then this could cause a circular dependency and deadlock. I do not believe this is the case today, as testing has shown endpoints to be available immediately on startup before the first worker RPC message comes in.

@jviau jviau requested a review from a team as a code owner June 27, 2024 18:59
@jviau jviau force-pushed the jviau/grpc/fix-startup-race branch from 6f6d1ab to a55bf13 Compare June 27, 2024 19:16
@jviau
Copy link
Contributor Author

jviau commented Jun 28, 2024

/azp run

Copy link

Azure Pipelines successfully started running 3 pipeline(s).

@jviau jviau merged commit a3734f9 into dev Jul 11, 2024
10 checks passed
@jviau jviau deleted the jviau/grpc/fix-startup-race branch July 11, 2024 04:49
jviau added a commit that referenced this pull request Jul 11, 2024
…10255)

* Ensure extension RPC endpoints ready before processing gRPC messages

* Add timeout and tests to waiting on RPC extensions.
jviau added a commit that referenced this pull request Jul 15, 2024
… gRPC messages (#10282)

* Ensure extension RPC endpoints ready before processing gRPC messages (#10255)

* Ensure extension RPC endpoints ready before processing gRPC messages

* Add timeout and tests to waiting on RPC extensions.

* update release_notes.md
v-imohammad pushed a commit that referenced this pull request Jul 15, 2024
… gRPC messages (#10282)

* Ensure extension RPC endpoints ready before processing gRPC messages (#10255)

* Ensure extension RPC endpoints ready before processing gRPC messages

* Add timeout and tests to waiting on RPC extensions.

* update release_notes.md
v-imohammad added a commit that referenced this pull request Jul 16, 2024
* JIT Trace - 4.35.0 using in-proc as base (#10281)

* Adding a timeout to `IFunctionProvider.GetFunctionMetadataAsync`  (#10249) (#10273)

* Adding a timeout to IFunctionProvider.GetFunctionMetadataAsync so that if a provider does not return, it will not cause a deadlock state.

* Adding release notes.

* Added a new internal property `MetadataProviderTimeoutInSeconds` which can be set to a different value than default, from the tests.

* Added comment about property being settable.

* PR feedback fixes. Throws an error when the function metadata provider method throws or when the operation timeseout.

* [in-proc port] Ensure extension RPC endpoints ready before processing gRPC messages (#10282)

* Ensure extension RPC endpoints ready before processing gRPC messages (#10255)

* Ensure extension RPC endpoints ready before processing gRPC messages

* Add timeout and tests to waiting on RPC extensions.

* update release_notes.md

* [in-proc] Update Microsoft.Azure.WebJobs to 3.0.41 and Microsoft.Azure.WebJobs.Host.Storage to 5.0.1 (#10288)

* Upgraded the following package versions:
  - Microsoft.Azure.WebJobs updated to 3.0.41
  - Microsoft.Azure.WebJobs.Host.Storage updated to 5.0.1
  - Microsoft.Extensions.Azure updated to 1.7.1
  - Azure.Storage.Blobs updated to 12.19.1

* Updating deps.json for Tests.

---------

Co-authored-by: Shyju Krishnankutty <connectshyju@gmail.com>
Co-authored-by: Jacob Viau <javia@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Service bus message actions throwing Grpc.Core.RpcException when sending multiple requests
3 participants