Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: Don't log expected errors #3016

Merged
merged 6 commits into from
Dec 11, 2024
Merged

chore: Don't log expected errors #3016

merged 6 commits into from
Dec 11, 2024

Conversation

matheus23
Copy link
Member

Description

I'm frequently seeing stuff like this in logs when running tests:

2024-12-06T11:33:18.547916Z ERROR iroh::magicsock: 515: failed to send udp: operation would block node=m4suaktjwmn62wiz dst=172.17.0.1:60377
2024-12-06T11:33:18.548017Z ERROR iroh::magicsock: 560: no UDP or relay paths available for node node=m4suaktjwmn62wiz

These two logs are totally non-fatal and actually expected.
The first one especially since try_send_upd can simply return io::ErrorKind::WouldBlock and that's totally normal.

So we avoid printing an error in that case.

I've also made the "no UDP or relay paths available for node" and another similar log debug level, as that's a state that we routinely encounter, is expected, and handled.

Notes & open questions

Just quickly noting that this log was in fact so annoying, I kept seeing the WouldBlock stuff a lot when running iroh-doctor with you on a train ride back then. It would mess up the whole terminal rendering 😵‍💫

Change checklist

  • Self-review.
  • [ ] Documentation updates following the style guide, if relevant.
  • [ ] Tests if relevant.
  • All breaking changes documented.

@matheus23 matheus23 self-assigned this Dec 6, 2024
Copy link

github-actions bot commented Dec 6, 2024

Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/3016/docs/iroh/

Last updated: 2024-12-11T13:44:21Z

Copy link

github-actions bot commented Dec 6, 2024

Netsim report & logs for this PR have been generated and is available at: LOGS
This report will remain available for 3 days.

Last updated for commit: 69ffa6f

@@ -563,16 +566,16 @@ impl MagicSock {
// at any time so these errors should be treated as transient and
// are just timeouts. Hence we opt for returning Ok. See
// test_try_send_no_udp_addr_or_relay_url to explore this further.
error!(
debug!(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one I find tricky to judge. I'm not entirely sure if this is a benign condition.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It happens every time I run a test.

I think the condition this happens in is for example: I create a new Endpoint with relays disabled.
Then I send data with the udp socket send eventually returning pending.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at the code above:

  • udp_pending ends up being true
  • relay_pending is false, because relay_url is None
  • relay_sent and udp_sent are both false
  • pings_sent ends up being false too, maybe because there's just no pings scheduled to send.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, in that case is it easy to create an isolated reproducer for this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at the code above:

* `udp_pending` ends up being `true`

* `relay_pending` is false, because `relay_url` is `None`

* `relay_sent` and `udp_sent` are both `false`

* `pings_sent` ends up being `false` too, maybe because there's just no pings scheduled to send.

Nice!

So this means this is a bug: the socket should return WouldBlock in this case. Right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this means this is a bug: the socket should return WouldBlock in this case. Right?

Not sure. There's another comment close by that says:

Returning WouldBlock triggers a hot loop

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To summarise an out-of-band conversation: this can not return WouldBlock because the next datagram being sent may have a destination with a RelayUrl so if you would go into blocking here you'd be wrongly blocking.

I've been sufficiently convinced now that this really is a normal situation and that we can turn down the log level here.

Could you add a comment explaining when this happens though? Basically some version of #3016 (comment) maybe with the bit that you can't block because you can't predict the dest of the next datagram.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added more comments. Lmk what you think :)

@dignifiedquire dignifiedquire added this to the v0.30.0 milestone Dec 9, 2024
Copy link
Contributor

@flub flub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@matheus23 matheus23 enabled auto-merge December 11, 2024 13:41
@matheus23 matheus23 added this pull request to the merge queue Dec 11, 2024
Merged via the queue into main with commit fdb687f Dec 11, 2024
26 checks passed
@matheus23 matheus23 deleted the matheus23/reduce-logging branch December 11, 2024 14:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

3 participants