Skip to content

Conversation

hugehoo
Copy link
Contributor

@hugehoo hugehoo commented Sep 11, 2025

Fixes: #8529

This PR fixes to increment metrics only when the stream is active which is found in the activeStreams map.

as-is

  • The deleteStream was incrementing channelz metrics every time it was called, even when stream was already removed from activeStreams or not exists in activeStreams.

to-be

  • Added check to ensure metrics are only incremented once when a stream is actually removed from activeStreams.

RELEASE NOTES: N/A

Copy link

codecov bot commented Sep 11, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.20%. Comparing base (8420f3f) to head (f701f5c).
⚠️ Report is 31 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8573      +/-   ##
==========================================
+ Coverage   80.91%   82.20%   +1.29%     
==========================================
  Files         413      415       +2     
  Lines       40751    40712      -39     
==========================================
+ Hits        32972    33467     +495     
+ Misses       6155     5877     -278     
+ Partials     1624     1368     -256     
Files with missing lines Coverage Δ
internal/transport/http2_server.go 90.88% <100.00%> (+<0.01%) ⬆️

... and 43 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@hugehoo hugehoo marked this pull request as ready for review September 11, 2025 17:41
@arjan-bal arjan-bal self-requested a review September 23, 2025 05:05
@arjan-bal arjan-bal self-assigned this Sep 23, 2025
@arjan-bal arjan-bal added this to the 1.77 Release milestone Sep 23, 2025
Copy link
Contributor

@arjan-bal arjan-bal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for sending the PR, nice job writing a unit test for this!

@arjan-bal arjan-bal assigned hugehoo and unassigned arjan-bal Sep 23, 2025
@hugehoo hugehoo requested a review from arjan-bal September 29, 2025 16:11
@arjan-bal arjan-bal assigned arjan-bal and unassigned hugehoo Sep 29, 2025
Comment on lines 3309 to 3313
if test.removeFromActive {
serverTransport.mu.Lock()
delete(serverTransport.activeStreams, serverStream.id)
serverTransport.mu.Unlock()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should not access the activeStreams map directly. We should rely on the first call to serverTransport.deleteStream to delete the stream. After the first call to serverTransport.deleteStream, we should assert that the metric counts are as expected.

After the first call to serverTransport.deleteStream, we can assume that the stream is deleted from the activeStreams map. We can call serverTransport.deleteStream a couple of times more and verify that the metric values haven't changed.

This would allow us to get rid of the removeFromActive test param since that case is already covered by the assertions made in the previous paragraph.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 3336 to 3343
if !test.removeFromActive {
serverTransport.mu.Lock()
_, exists := serverTransport.activeStreams[serverStream.id]
serverTransport.mu.Unlock()
if exists {
t.Error("Stream should have been removed from activeStreams")
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can remove the assertions on the activeStreams map here. Ideally we should test the public API of a struct (methods on serverTransport) instead of it's internal details. In this case, testing the metric should give us enough confidence that the state of the map is correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 3315 to 3316
initialSucceeded := serverTransport.channelz.SocketMetrics.StreamsSucceeded.Load()
initialFailed := serverTransport.channelz.SocketMetrics.StreamsFailed.Load()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These values will always be 0, right? If so, we can assert they are 0 and assert that the absolute values of the metrics later instead of deltas. This would make it clearer for readers to see exactly what is the expected value of the metric.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arjan-bal arjan-bal assigned hugehoo and unassigned arjan-bal Oct 3, 2025
@hugehoo
Copy link
Contributor Author

hugehoo commented Oct 3, 2025

@arjan-bal thx for the review, i updated additional comments

@hugehoo hugehoo requested a review from arjan-bal October 3, 2025 11:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

transport: Servers may emit extra channelz stream metrics on cancellation
2 participants