Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(swarm): external address candidate only after address translation #4158

Merged
merged 3 commits into from
Jul 9, 2023

Conversation

b-zee
Copy link
Contributor

@b-zee b-zee commented Jul 5, 2023

The NewExternalAddrCandidate event is yielded both before and after address translation. This will
cause, in the case of TCP, ephemeral ports to be added as candidate.

In turn, that will cause protocols like AutoNAT to fail as these candidates are not actually
reachable/external.

Whilst technically not wrong, yielding candidates that are for sure invalid is mere noise. Thus, we only yield the candidate if address translation did not yield any new candidates.

Fixes #4153.

@b-zee b-zee force-pushed the fix-swarm-addr-candidate branch from 12c6f62 to 92df42b Compare July 5, 2023 14:43
@b-zee b-zee changed the title swarm: fix external address candidate Fix: external address candidate only after address translation Jul 5, 2023
@b-zee b-zee changed the title Fix: external address candidate only after address translation fix: external address candidate only after address translation Jul 5, 2023
@b-zee b-zee changed the title fix: external address candidate only after address translation fix(swarm): external address candidate only after address translation Jul 5, 2023
Copy link
Member

@mxinden mxinden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀 thank you @b-zee for reporting, debugging and fixing!

Will give @thomaseizinger some time as well in case we are missing something.

In the meantime @b-zee mind adding a changelog entry? See https://github.com/libp2p/rust-libp2p/blob/master/docs/release.md#make-changelog-entries for details. That way I can cut a patch release once this pull request is merged.

Copy link
Contributor

@thomaseizinger thomaseizinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this was actually on purpose but may have been a logic error!

First off, it should not be a problem to yield address candidates that are wrong. They are candidates for a reason!

Second, address translation will return None if the transport does not understand the address. Thus despite having listen addresses, this loop is not guaranteed to yield an event!

This will sound niche but consider the following:

  • We listen on TCP locally
  • We have some kind of overlay proxy network, like Tor
  • We emit an external addr candidate for the Tor address

If we merge this PR, that would never get emitted.

One could argue that Tor addresses are always external and so one should not emit a candidate?

Perhaps we should first do address translation and if that does not yield anything for whatever reason, just emit the untouched address?

@thomaseizinger
Copy link
Contributor

First off, it should not be a problem to yield address candidates that are wrong. They are candidates for a reason!

Reading your original issue, I'd say that in general, this isn't a bug. There may be all sorts of reasons why what we thought is an external address does not turn out to be one.

However, it does make sense to optimize our code so that bad guesses are avoided which is what this is doing :)

It does feel like not emitting the untouched address can lead to a different, subtle and more severe bug than emitting it so I am a bit torn.

I guess we can assume that address translation does not produce a garbage address? i.e. does the assumption hold that if we translate it, the translated candidate is always strictly better than the given one?

@b-zee
Copy link
Contributor Author

b-zee commented Jul 6, 2023

First off, it should not be a problem to yield address candidates that are wrong. They are candidates for a reason!

That is true. I have been looking at this from the perspective of AutoNAT. If I remember correctly, AutoNAT v2 will fix the problem that rises from this issue, because one failed dial will not disturb the rest of the results.

Perhaps we should first do address translation and if that does not yield anything for whatever reason, just emit the untouched address?

This sounds reasonable as far as I can tell.

@b-zee
Copy link
Contributor Author

b-zee commented Jul 6, 2023

I guess we can assume that address translation does not produce a garbage address? i.e. does the assumption hold that if we translate it, the translated candidate is always strictly better than the given one?

I think that 'address translation' should happen. If there is nothing to translate or the transport doesn't need to translate, then I expect the same address to come out. This was my assumption going into the PR. So yes, I think that sounds intuitive, the translated address(es) is always better than the input.

@thomaseizinger
Copy link
Contributor

Okay, thanks for thinking it through with me :)

Are you okay to adjust the PR to the proposed solution?

@b-zee b-zee force-pushed the fix-swarm-addr-candidate branch from 92df42b to 7323d9c Compare July 6, 2023 09:54
@b-zee
Copy link
Contributor Author

b-zee commented Jul 6, 2023

Made the changes. Let me know if this works for you! @thomaseizinger @mxinden

Copy link
Contributor

@thomaseizinger thomaseizinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! One more thing to discuss.

swarm/src/lib.rs Show resolved Hide resolved
b-zee added 2 commits July 6, 2023 14:29
The `NewExternalAddrCandidate` event is yielded both before and after address translation. This will
cause, in the case of TCP, ephemeral ports to be added as candidate.

In turn, that will cause protocols like AutoNAT to fail as these candidates are not actually
reachable/external.

We will now only yield the original candidate if translation did not apply.

Fixes libp2p#4153
@b-zee b-zee force-pushed the fix-swarm-addr-candidate branch from 7323d9c to 21338e1 Compare July 6, 2023 12:29
@thomaseizinger
Copy link
Contributor

Thank you!

@mergify mergify bot merged commit b4c7dfc into libp2p:master Jul 9, 2023
@b-zee b-zee deleted the fix-swarm-addr-candidate branch July 10, 2023 07:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Identify behavior adds observed address, but contains ephemeral port causing AutoNAT probe failure
3 participants