Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(iroh-net)!: DiscoveredDirectAddrs need to update the timestamp #2808

Merged
merged 17 commits into from
Oct 18, 2024

Conversation

flub
Copy link
Contributor

@flub flub commented Oct 16, 2024

Description

DiscoverdDirectAddrs has a timestamp of when it was last changed, which is used to decide if the direct addr information is recent enough to use in call-me-maybe messages. However when it was updated this timestamp was not included in the comparison to see if there were changes, so that watchers would not get updates when the direct addresses did not change. Unfortunately this meant the timestamp also never updated and the DiscoveredDirectAddrs was always considered out of date (unless there were changes in the discovered addresses).

This fixes this by putting only the DirectAddrs in the watcher.

It does a few other improvements:

  • Moves all logic about DiscoveredDirectAddrs into one place. This could be split out to a module now if desired.

  • Tidies up the code which creates the DirectAddrs from the portmapper, netcheck and the local addresses. Again this is just a first pass in cleaning up this code, there is so much more that could be done.

Breaking Changes

  • iroh_net::Endpoint::direct_addresses now returns a stream yielding Items of BTreeSet<DirectAddr> instead of Vec<DirectAddr>.
  • iroh_base::node_addr::NodeAddr::from_parts now takes a impl IntoIterator<Item = DirectAddr> instead of Vec<DirectAddr>. But since Vec implements IntoIterator that is not really a breaking change.

Notes & open questions

The crux of this fix is moving from Watchable<DiscoveredDirectAddresses> to DiscoveredDirectAddresses::addrs being Watchable<Vec<DirectAddr>> instead of the whole struct being watchable.

I could split the refactoring of store_direct_addr_update and the renaming of it's related functions into a separate PR in front of this bugfix if desired. I had to change that code because of the format change from Vec<DirectAddrs> to BTreeSet<DirectAddr> and it was just way clearer doing it with the refactoring.

As indicated in the (already existing) comments that whole direct address updating logic can still be more refactored and improved. But many small steps is how we improve.

I went very opinionated on the imports... apologies.

Change checklist

  • Self-review.
  • Documentation updates following the style guide, if relevant.
  • Tests if relevant.
  • All breaking changes documented.

DiscoverdDirectAddrs has a timestamp of when it was last changed,
which is used to decide if the direct addr information is recent
enough to use in call-me-maybe messages.  However when it was updated
this timestamp was not included in the comparison to see if there were
changes, so that watchers would not get updates when the direct
addresses did not change.  Unfortunately this meant the timestamp also
never updated and the DiscoveredDirectAddrs was always considered out
of date (unless there were changes in the discovered addresses).

This fixes this by putting only the DirectAddrs in the watcher.

It does a few other improvements:

- Moves all logic about DiscoveredDirectAddrs into one place.  This
  could be split out to a module now if desired.

- Tiedies up the code which creates the DirectAddrs from the
  portmapper, netcheck and the local addresses.  Again this is just a
  first pass in cleaning up this code, there is so much more that
  could be done.
Copy link

github-actions bot commented Oct 16, 2024

Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/2808/docs/iroh/

Last updated: 2024-10-18T17:00:58Z

flub added 2 commits October 16, 2024 11:36
insert overwrites the existing entries in the map, using the entry API
we can avoid this.
(bonus, rustfmt formats the code more compactly!)
@flub flub marked this pull request as ready for review October 16, 2024 09:51
Copy link

github-actions bot commented Oct 16, 2024

Netsim report & logs for this PR have been generated and is available at: LOGS
This report will remain available for 3 days.

Last updated for commit: 820afe8

Copy link
Contributor

@ramfox ramfox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me!

There are some weird cross comp failing tests, but that doesn't look like it started with this PR.

iroh-net/src/magicsock.rs Outdated Show resolved Hide resolved
iroh-net/src/magicsock.rs Outdated Show resolved Hide resolved
@flub flub requested a review from matheus23 October 17, 2024 09:23
Copy link
Member

@matheus23 matheus23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About imports:
We could/perhaps should add a .rustfmt.toml file configured to your personal import style. (Let's also talk about it with the team first ;) )
Specifically, there's the imports_granularity and group_imports settings.

edition = "2021"
imports_granularity = "Module"
group_imports = "StdExternalCrate"

iroh-net/src/magicsock.rs Outdated Show resolved Hide resolved
iroh-net/src/magicsock.rs Show resolved Hide resolved
@flub flub changed the title fix(iroh-net): DiscoveredDirectAddrs need to update the timestamp fix(iroh-net)!: DiscoveredDirectAddrs need to update the timestamp Oct 17, 2024
@flub
Copy link
Contributor Author

flub commented Oct 17, 2024

About imports: We could/perhaps should add a .rustfmt.toml file configured to your personal import style. (Let's also talk about it with the team first ;) ) Specifically, there's the imports_granularity and group_imports settings.

edition = "2021"
imports_granularity = "Module"
group_imports = "StdExternalCrate"

I would love this! But they're not on stable rust.

Copy link
Contributor

@divagant-martian divagant-martian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To accelerate the process, sending an incomplete review so that you can act on this.

Also, sorry to be so pedantic, but a good chunk of the changes are unrelated to the pr title and have diverted into conversations that should not keep these changes from going through, like import formatting. Would you mind splitting this into its own PR?

iroh-base/src/node_addr.rs Outdated Show resolved Hide resolved
iroh-net/src/magicsock.rs Outdated Show resolved Hide resolved
iroh-net/src/magicsock.rs Outdated Show resolved Hide resolved
iroh-net/src/magicsock.rs Outdated Show resolved Hide resolved
let direct_addrs = self.direct_addrs.read();
if direct_addrs.fresh_enough() {
let msg = direct_addrs.to_call_me_maybe_message();
if self.direct_addrs.fresh_enough() {
Copy link
Contributor

@divagant-martian divagant-martian Oct 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this changes the semantics. Before, this method worked on one copy of the direct addresses, now there is a read access per call which can potentially lead to different values. So, tldr I think this is racy. Or at least seen only from the perspective of this function. If there is some magic above preventing multiple accesses to this, in particular writes, I'm not sure

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Firstly, I don't think this should rely on magic outside of the function so I'm with you with ignoring what happens in the callers.

Looking at the code however I think the only thing which is racy is the debug log of last_refresh_ago. If someone updates the direct addresses after the fresh_enough call says it's fresh enough... it still is fresh enough. So I don't think calling to_call_me_maybe_messages separate from calling fresh_enough is a problem and am tempted to leave that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the API of fresh_enough to return Result<(), Duration> to fix this. Also gets rid of the horrible last_fresh_ago method 😄

I appreciate this is not the best code. I my current thinking to do this thoroughly is to split this whole direct address management into an actor and have that actor just keep it's state (splitting off all that state from the magicsock will already be grand!) and refresh this stuff. Whenever is has new direct addresses it should send them to the magicsock so it can store those for access on the hot path. that actor itself can basically supervise itself: either it continues to run and provides addresses or it panics and brings down the process - i dont' think there needs to be an in-between (bear in mind "fresh enough" is 27 seconds).

(I also think that the interfaces information should be in an actor that behaves somewhat similar, but that's very tangential to this.)

@flub flub requested a review from divagant-martian October 18, 2024 11:02
Copy link
Contributor

@divagant-martian divagant-martian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

}
}
}

// If a socket is bound to a specific address, add it.
Copy link
Contributor

@divagant-martian divagant-martian Oct 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice comma 💪

@divagant-martian divagant-martian added this pull request to the merge queue Oct 18, 2024
Merged via the queue into main with commit 85bd8b7 Oct 18, 2024
26 of 27 checks passed
@divagant-martian divagant-martian deleted the flub/direct-addrs-updated-at branch October 18, 2024 21:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

6 participants