Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adversarial stream client issues #237

Closed
jmacd opened this issue Jul 22, 2024 · 0 comments · Fixed by open-telemetry/opentelemetry-collector-contrib#34236
Closed

Adversarial stream client issues #237

jmacd opened this issue Jul 22, 2024 · 0 comments · Fixed by open-telemetry/opentelemetry-collector-contrib#34236

Comments

@jmacd
Copy link
Contributor

jmacd commented Jul 22, 2024

While testing in open-telemetry/opentelemetry-collector-contrib#34179, I discovered a case where the Arrow receiver can get "wedged" if the client is not calling Recv(). It is possible to arrange a test such that the receiver send loop is stuck in a call to Send() from which it will not exit on its own by writing a test that sends data without receiving batch status responses.

The data could be legitimately sent and followed by a CloseSend() before the stream max-age-grace. In this scenario, it seems the call to receiver Send() will not be unblocked by gRPC itself.

codeboten pushed a commit to open-telemetry/opentelemetry-collector-contrib that referenced this issue Jul 26, 2024
**Description:** Several test flakes were identified in
#34179.
This improves the tests and adds one new test. Adds one new
context-canceled return path, fixing a potential goroutine leak covered
by new testing.

**Link to tracking Issue:** Fixes
open-telemetry/otel-arrow#237

**Testing:** 

- For the test change in
`receiver/otelarrowreceiver/internal/arrow/arrow_test.go` note that the
test now allows out-of-order responses, which is explicitly a feature.
Out-of-order responses may have resulted from performance changes in
Arrow v17.
- For the context handling in
`receiver/otelarrowreceiver/internal/arrow/arrow.go` this is needed to
satisfy the HalfOpen test introduced here. A comment in the code
explains how previously gRPC stream resources could leak
- The ` TestOTelArrowShutdown` test improves readability. It had been
using a test helper designed originally for the OTLP (non-streaming)
test, which made it convoluted here. A sync.Once has been eliminated.
- The new test `TestOTelArrowHalfOpenShutdown` exercises the
"adversarial" condition where a client never calls `Recv()` on their
stream handle. The new context handling is required to pass the test w/o
goroutine leaks.

---------

Co-authored-by: Matthew Wear <matthew.wear@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
1 participant