Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(subscriber): do not report excessive polling (#378) #440

Merged
merged 6 commits into from
Jul 3, 2023

Conversation

MatteoNardi
Copy link
Contributor

When entering and exiting a span the old code was also updating the parent
stats. This was causing excessive polling being reported for the parent tasks.
See issue #378 for more details. The regression was introduced by the refactor
in #238.
This fixes the issue by limiting updates to the current span.

Closes #378

@MatteoNardi MatteoNardi requested a review from a team as a code owner June 25, 2023 21:10
@MatteoNardi
Copy link
Contributor Author

I still have some doubts regarding the original need for the parent update. Here is the research I've made, hopefully it could be useful when reviewing this.

Before #238 landed, on_enter sent id and parent_id to the aggregator:
https://github.com/tokio-rs/console/blob/console-subscriber-v0.1.0/console-subscriber/src/lib.rs#L725-L732
The aggregator would then increase the poll stats of id if it was a task and increase the poll stats of parent_id if it was an async op.
https://github.com/tokio-rs/console/blob/console-subscriber-v0.1.0/console-subscriber/src/aggregator/mod.rs#L630-L634
Notice how this is different both from the change in #238 (which increases the stats of both spans) and the code after this PR (which updates the task and async op stats of the current span). Before commit 1aa9b59, I think the behavior matched the one in this PR.

If I got it right, I think this is what the various branches were updating:

branch current task parent task current async op parent async op
this PR yes no yes no
main branch yes yes yes yes
before 238 yes no no yes
before 1aa9b59 yes no yes no

I'll need some help understanding the proper behavior :D

Copy link
Collaborator

@hds hds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some testing of this change myself and it seemed to work for all the cases I tried, so I believe your logic is correct.

I've got a couple of style comments on this change though.

console-subscriber/src/lib.rs Outdated Show resolved Hide resolved
console-subscriber/src/lib.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@hds hds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of small changes for clarity.

console-subscriber/src/lib.rs Outdated Show resolved Hide resolved
console-subscriber/src/lib.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@hds hds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Thank you very much!

@hds
Copy link
Collaborator

hds commented Jul 3, 2023

@hawkw Would you mind taking a look at this one now that CI is green again? I'm pretty sure that this is the correct logic. What I'm not sure about is why the previous logic was there, which makes me think that I'm missing something important.

@hawkw
Copy link
Member

hawkw commented Jul 3, 2023

yeah, i've been meaning to give this one a closer look!

@hawkw
Copy link
Member

hawkw commented Jul 3, 2023

@hds

What I'm not sure about is why the previous logic was there, which makes me think that I'm missing something important.

After thinking about this for a bit, I think the previous logic is just...wrong. I believe I wrote it that way because I was blindly copying the behavior a tracing subscriber would do for tracking span timing, but spawned task spans are not the same as spans that execute inside of other spans. A spawned task shouldn't be incrementing its parent's active times, and I just wasn't thinking when I wrote that code. :)

I'm pretty sure this change is correct. It does remind me that we should really probably just change tokio to have parent: None for task spans...

@hds
Copy link
Collaborator

hds commented Jul 3, 2023

I'm pretty sure this change is correct. It does remind me that we should really probably just change tokio to have parent: None for task spans...

Yeah. We should. I have that issue for this, tokio-rs/tokio#5792.

@hawkw hawkw enabled auto-merge (squash) July 3, 2023 18:05
@hawkw hawkw merged commit 91929d0 into tokio-rs:main Jul 3, 2023
hawkw added a commit that referenced this pull request Sep 29, 2023
When entering and exiting a span the old code was also updating the parent
stats. This was causing excessive polling being reported for the parent tasks.
See issue #378 for more details. The regression was introduced by the refactor
in #238.
This fixes the issue by limiting updates to the current span.

Closes #378

Co-authored-by: Hayden Stainsby <hds@caffeineconcepts.com>
Co-authored-by: Eliza Weisman <eliza@buoyant.io>
hawkw added a commit that referenced this pull request Sep 29, 2023
# Changelog

All notable changes to this project will be documented in this file.
This project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## console-subscriber-v0.2.0 - (2023-09-29)

[0b0c1af](https://github.com/tokio-rs/console/commit/0b0c1aff18c3260d3a45a78f6c0d6f4206af1cbb)...[0b0c1af](https://github.com/tokio-rs/console/commit/0b0c1aff18c3260d3a45a78f6c0d6f4206af1cbb)

### <a id = "console-subscriber-v0.2.0-breaking"></a>Breaking Changes
- **Update Tonic and Prost dependencies ([#364](#364 ([f9b8e03](https://github.com/tokio-rs/console/commit/f9b8e03bd7ee1d0edb441c94a93a350d5b06ed3b))<br />This commit updates the public dependencies `prost` and `tonic` to
semver-incompatible versions (v0.11.0 and v0.8.0, respectively). This is
a breaking change for users who are integrating the `console-api` protos
with their own `tonic` servers or clients.
- **Update `tonic` to v0.10 and increase MSRV to 1.64 ([#464](#464 ([96e62c8](https://github.com/tokio-rs/console/commit/96e62c83ef959569bb062dc8fee98fa2b2461e8d))<br />This is a breaking change for users of `console-api` and
`console-subscriber`, as it changes the public `tonic` dependency to a
semver-incompatible version. This breaks compatibility with `tonic`
0.9.x and `prost` 0.11.x.

### Added

- [**breaking**](#console-subscriber-v0.2.0-breaking) Update Tonic and Prost dependencies ([#364](#364)) ([f9b8e03](f9b8e03))
- Add support for Unix domain sockets ([#388](#388)) ([a944dbc](a944dbc), closes [#296](#296))
- Add scheduled time per task ([#406](#406)) ([f280df9](f280df9))
- Add task scheduled times histogram ([#409](#409)) ([d92a399](d92a399))
- Update `tonic` to 0.9 ([#420](#420)) ([48af1ee](48af1ee))
- Update MSRV to Rust 1.60.0 ([b18ee47](b18ee47))
- Expose server parts ([#451](#451)) ([e51ac5a](e51ac5a))
- Add cfg `console_without_tokio_unstable` ([#446](#446)) ([7ed6673](7ed6673))
- Add warning for tasks that never yield ([#439](#439)) ([d05fa9e](d05fa9e))
- [**breaking**](#console-subscriber-v0.2.0-breaking) Update `tonic` to v0.10 and increase MSRV to 1.64 ([#464](#464)) ([96e62c8](96e62c8))

### Documented

- Fix unclosed code block ([#463](#463)) ([362bdbe](362bdbe))
- Update MSRV version docs to 1.64 ([#467](#467)) ([94a5a51](94a5a51))

### Fixed

- Fix build on tokio 1.21.0 ([#374](#374)) ([c34ac2d](c34ac2d))
- Fix off-by-one indexing for `callsites` ([#391](#391)) ([43891ab](43891ab))
- Bump minimum Tokio version ([#397](#397)) ([bbb8f25](bbb8f25), fixes [#386](#386))
- Fix self wakes count ([#430](#430)) ([d308935](d308935))
- Remove clock skew warning in `start_poll` ([#434](#434)) ([4a88b28](4a88b28))
- Do not report excessive polling ([#378](#378)) ([#440](#440)) ([8b483bf](8b483bf), closes [#378](#378))
- Correct retain logic ([#447](#447)) ([36ffc51](36ffc51))

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
hds added a commit that referenced this pull request Oct 11, 2023
A defect in the way polls are counted was fixed in #440. The defect
caused the polls of a "child" task to also be counted on the "parent"
task. Where in this case, the "parent" is a task that spawns the "child"
task.

This change adds a regression test for that fix.

To do so, functionality to record and validate poll counts is added to
the `console-subscriber` integration test framework. The new
functionality is also used to add some basic tests for poll counts.
hawkw pushed a commit that referenced this pull request Nov 16, 2023
A defect in the way polls are counted was fixed in #440. The defect
caused the polls of a "child" task to also be counted on the "parent"
task. Where in this case, the "parent" is a task that spawns the "child"
task.

This change adds a regression test for that fix.

To do so, functionality to record and validate poll counts is added to
the `console-subscriber` integration test framework. The new
functionality is also used to add some basic tests for poll counts.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Console reporting excessive polling, that isn't there
3 participants