chore(p2p): dnsaddr recursive resolution #2204
crates/services/p2p/Cargo.toml
Outdated
@@ -50,6 +50,7 @@ thiserror = "1.0.47"
 tokio = { workspace = true, features = ["sync"] }
 tracing = { workspace = true }
 void = "1"
+hickory-resolver = "0.25.0-alpha.2"
Can we use the resolver from libp2p?
do you mean the same version as libp2p or to use their resolver?
their resolver is in libp2p-dns (https://docs.rs/libp2p-dns/latest/libp2p_dns/tokio/type.Transport.html) and doesn't export any helper functions to resolve dnsaddrs.
I see, they use the same library to do it. Yeah, we need to use the same version so we don't bloat the Cargo.lock.
addressed in 4f4b662
let mut dnsaddr_multiaddrs = vec![];

for dnsaddr in dnsaddr_urls {
    let multiaddrs = dns_resolver.lookup_dnsaddr(dnsaddr.as_ref()).await?;
What if a `dnsaddr` resolves to another `dnsaddr`, will it work?
yup, resolution is recursive. the test case handles that :)
const MAX_DNS_LOOKUPS: usize = 10;

#[async_trait::async_trait]
pub trait DnsLookup {
Why do we need a trait?
removed :)
@@ -97,7 +102,7 @@ impl Config {
         self
     }

-    pub fn finish(self) -> Behaviour {
+    pub async fn finish(self) -> anyhow::Result<Behaviour> {
Just curious, have you considered an implementation that doesn't require `async` during construction? Maybe it is possible to resolve the addresses in `fn start`?
do you mean here -
pub async fn start(&mut self) -> anyhow::Result<()> {
addressed in 5b4671f
I hate to block this PR, but there are two pretty substantial issues I can see so far that I would like to have addressed.
- I don't think we should use `.now_or_never()` to assert that `config.finish().await` is ready in `build_behavior_fn`. It is not clear to me that this future will always be ready when this is invoked.
- The DNS resolution tests should not be dependent on the host machine's local DNS cache or network connection. Instead I'd suggest creating a port for this, which would allow us to create more tests for different DNS configurations.
Moreover, if a port is introduced to allow the DNS lookup logic to be tested, I'd be interested in seeing a test case covering recursive lookups to make sure they work as expected.
crates/services/p2p/src/discovery.rs
Outdated
async move { config.finish().await }
    .now_or_never()
    .unwrap()
    .unwrap()
I don't see how we can guarantee this future to be immediately ready. What happens if the `DnsResolver::new().await?` is pending, for example? As I read this, we'd panic in that scenario. That feels quite brittle to me, in which case it would feel safer to just use the synchronous resolver instead.
My preferred option though would be to allow `build_behavior_fn` to return a function which returns a future, so we can return this future without having to panic if it isn't ready here.
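To make the concern concrete, here is a minimal std-only sketch, not the code from this PR. `poll_once` is a hypothetical stand-in for `futures::FutureExt::now_or_never`, and `YieldOnce` is an invented future that is pending on its first poll, the way an I/O-bound setup step might be. Polling it once yields `None`, so a chained `.unwrap()` would panic.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough for polling a future a single time.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical stand-in for `now_or_never`: poll exactly once and
/// return `None` if the future is not ready yet.
fn poll_once<F: Future>(fut: F) -> Option<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(out) => Some(out),
        Poll::Pending => None,
    }
}

/// Invented future that returns `Pending` on the first poll and
/// `Ready(7)` on the second.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        if self.yielded {
            Poll::Ready(7)
        } else {
            self.yielded = true;
            // Ask to be polled again, as a well-behaved future should.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}
```

With these definitions, `poll_once(async { 1u32 })` is `Some(1)`, while `poll_once(YieldOnce { yielded: false })` is `None`; unwrapping the latter is the panic described above.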
deprecated in 5b4671f
Nice, thank you!
let dnsaddr_urls = multiaddrs
    .iter()
    .filter_map(|node| {
        if let Protocol::Dnsaddr(multiaddr) = node.iter().next()? {
            Some(multiaddr.clone())
        } else {
            None
        }
    })
    .collect::<Vec<_>>();
Nit: This reads a bit funky to me. Wouldn't the singular of `multiaddrs` be `multiaddr` rather than `node`? I'd suggest this naming instead:
-let dnsaddr_urls = multiaddrs
-    .iter()
-    .filter_map(|node| {
-        if let Protocol::Dnsaddr(multiaddr) = node.iter().next()? {
-            Some(multiaddr.clone())
-        } else {
-            None
-        }
-    })
-    .collect::<Vec<_>>();
+let dnsaddr_urls = multiaddrs
+    .iter()
+    .filter_map(|multiaddr| {
+        if let Protocol::Dnsaddr(dnsaddr_url) = multiaddr.iter().next()? {
+            Some(dnsaddr_url.clone())
+        } else {
+            None
+        }
+    })
+    .collect::<Vec<_>>();
addressed in 0482f80
/// This limit is for preventing malicious or misconfigured DNS records from causing infinite recursion.
const MAX_DNS_LOOKUPS: usize = 10;

#[async_trait::async_trait]
Async traits have been stable since Rust 1.75, so I'd prefer not to add this declaration for new traits. Instead just have the functions return `impl Future<Output = ...>` and use `async fn` in the implementations.
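A minimal sketch of that style, assuming Rust 1.75 or later. The `TxtLookup` trait, the `StaticRecords` test double, and the tiny `block_on` helper are hypothetical names made up for illustration; none of them come from this PR or from hickory-resolver.

```rust
use std::collections::HashMap;
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical port: the trait declares a plain fn returning `impl Future`,
/// so no `#[async_trait]` attribute is needed.
trait TxtLookup {
    fn txt_lookup(
        &self,
        name: &str,
    ) -> impl Future<Output = Result<Vec<String>, String>> + Send;
}

/// Hypothetical in-memory implementation, e.g. for tests.
struct StaticRecords(HashMap<String, Vec<String>>);

impl TxtLookup for StaticRecords {
    // The implementation is free to use `async fn`.
    async fn txt_lookup(&self, name: &str) -> Result<Vec<String>, String> {
        self.0
            .get(name)
            .cloned()
            .ok_or_else(|| format!("no TXT records for {name}"))
    }
}

/// Waker that does nothing; sufficient for futures that never really wait.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor, only meant for this example.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

The `+ Send` bound in the trait is a design choice for this sketch: it lets callers spawn the returned future on a multi-threaded runtime, at the cost of requiring every implementation to produce a `Send` future.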
removed the trait
) -> Pin<
    Box<dyn std::future::Future<Output = anyhow::Result<Vec<Multiaddr>>> + Send + 'a>,
> {
Do we need to do the `Pin<Box<...>>` here? That feels like an artifact of `async_trait` which should preferably be removed (as per my other suggestion), or if you want to keep the `async_trait`, this should happen in the trait implementation and not in this private helper method.
recursion needs us to pin the future
Ah right, makes sense 👍
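For readers wondering why recursion forces the boxing: a recursive async function would otherwise have a future type that contains itself, so the recursive call has to go through a heap-allocated `Pin<Box<dyn Future>>`. The sketch below illustrates the pattern together with a depth limit. It is not the PR's implementation: the in-memory `Records` map stands in for real DNS, addresses are plain strings rather than `Multiaddr` values, and `resolve_recursive` and `block_on` here are hypothetical.

```rust
use std::collections::HashMap;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

const MAX_DNS_LOOKUPS: usize = 10;

/// Stand-in for DNS: hostname -> addresses found in its TXT records.
type Records = HashMap<String, Vec<String>>;

/// Hypothetical recursive resolver. Returning a boxed future gives the
/// recursive call a fixed-size type to await.
fn resolve_recursive<'a>(
    records: &'a Records,
    host: String,
    depth: usize,
) -> Pin<Box<dyn Future<Output = Result<Vec<String>, String>> + Send + 'a>> {
    Box::pin(async move {
        if depth >= MAX_DNS_LOOKUPS {
            return Err(format!("gave up after {MAX_DNS_LOOKUPS} nested lookups"));
        }
        let mut resolved = Vec::new();
        // An unknown host simply yields no addresses in this sketch.
        for addr in records.get(&host).into_iter().flatten() {
            match addr.strip_prefix("/dnsaddr/") {
                // Nested dnsaddr: recurse on its hostname (first path segment).
                Some(rest) => {
                    let next = rest.split('/').next().unwrap_or_default().to_string();
                    resolved.extend(resolve_recursive(records, next, depth + 1).await?);
                }
                // Anything else is treated as a final address.
                None => resolved.push(addr.clone()),
            }
        }
        Ok(resolved)
    })
}

/// Waker that does nothing; sufficient for futures that never really wait.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor, only meant for this example.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}
```

A record set in which a host points back at itself ends in the `Err` branch once the depth reaches `MAX_DNS_LOOKUPS`, which is the failure mode the limit is meant to bound.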
// given
let resolver = DnsResolver::new().await.unwrap();
// when
let multiaddrs = resolver
    .lookup_dnsaddr("bootstrap.libp2p.io")
    .await
    .unwrap();
// then
assert!(!multiaddrs.is_empty());
Do we assume here that any machine running this test will always be able to look up the "bootstrap.libp2p.io" address in their environment?
For example, what happens if I'm flushing my local DNS cache and disconnect from the internet before running this test? As I read this, the test case would fail in that scenario.
I think we need to introduce a port to manage the DNS lookup to make this logic testable.
I think the introduction of the port will make these tests useless since they will just test how the port itself works. Plus I'm not sure how hard it is to write your own DNS resolver.
I'm okay with the idea that this test fails without a connection or local DNS resolution.
Having a test with real DNS proves that it works in a real environment. The libp2p library tests it in the same way=)
While I see your point, I have to respectfully disagree. I think the port will help us make the tests even more useful. Right now we can't assert much more than `!multiaddrs.is_empty()`, because there's no way for us to control which DNS records we get when we look up `bootstrap.libp2p.io`. With the port, we for example can test that our logic parses the TXT records correctly.
As for another example, there's currently no way to know from this test if this test is doing a recursive lookup or just a plain single lookup - and this can vary depending on how `libp2p.io` configures their records. With the trait we can write multiple test cases to test and make clear assertions about different scenarios.
I don't think it would be too hard to put a trait between our code and the `TokioAsyncResolver`, because as far as I can see, we're only calling `TokioAsyncResolver::txt_lookup` so we only need to mock one method for the tests.
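As a rough illustration of the kind of strict assertion that controlled records would allow, here is a sketch of the TXT-parsing step in isolation. `parse_dnsaddr_txt` is a hypothetical helper, not a function from this PR; the `dnsaddr=` prefix follows the record format quoted in the test comments of this thread.

```rust
/// Hypothetical helper: keep only TXT records of the form
/// `dnsaddr=<multiaddr>` and return the multiaddr part of each.
fn parse_dnsaddr_txt<'a>(txt_records: &[&'a str]) -> Vec<&'a str> {
    txt_records
        .iter()
        .copied()
        // Records without the prefix (SPF entries, etc.) are skipped.
        .filter_map(|record| record.trim().strip_prefix("dnsaddr="))
        .collect()
}
```

With records supplied by a mock instead of the network, a test can compare the parsed output against an exact expected list rather than only checking that it is non-empty.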
Okay, I will try to put that in another way=)
The main functionality that I would like to see tested is that we can resolve the real addresses from ipfs (or fuel testnet/mainnet). So, I still want to see the test that uses a real `dnsaddr` and verifies that `TokioAsyncResolver` works as expected.
The functionality of `resolve_recursive` and how this function works are not so important to me.
The approach with the port for the internal resolver can help to test the behaviour of the `resolve_recursive` function in different use cases. And it looks like the right call to do it.
It is up to @rymn. I'm okay with doing that in a separate PR just dedicated to better test coverage of the `resolve_recursive` since the main feature is implemented and we have an integration test that covers this feature=)
how about this? I have some dnsaddr's on my personal domain anyway, we can just reuse those with recursion and assert strictly about the multiaddrs this function spits out?
Sure. I'd prefer that we test against fuel domains, but I'm okay with any domain within our control, so using your personal domain for now and creating a follow-up to set up fuel domains would be my preferred option. Any owner of a hard-coded domain in our tests will have the power to block our CI temporarily, so I'd rather have it be someone in the team.
I'd still find the port solution more readable, since this scenario is only verifiable by manually doing DNS lookups to check which records exist on the host machine.
addressed in 8c77c1d
i won't touch the records, i promise 😂
Hahah pinky promise 😂 Nice comment also, super helpful thank you! 🙏
I'll re-review tonight, but my main concerns have been addressed. Thank you!
Concerns have been addressed, but I need to re-review before approving.
    .lookup_dnsaddr("bootstrap.libp2p.io")
    .await
    .unwrap();
// run a `dig +short txt rymnc.com` to get the TXT records
err, this should be _dnsaddr.rymnc.com. patching.
LGTM=)
// notice that it contains -
// `dnsaddr=/dnsaddr/zone-1.rymnc.com/tcp/4001/p2p/QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN`
// which is a recursive call
let multiaddrs = resolver.lookup_dnsaddr("rymnc.com").await.unwrap();
Maybe it is better to use main net dnsaddr=)
addressed in 92f5f22
Co-authored-by: Green Baneling <XgreenX9999@gmail.com>
// when
// run a `dig +short txt _dnsaddr.mainnet.fuel.network` to get the TXT records
let multiaddrs = resolver
    .lookup_dnsaddr("mainnet.fuel.network")
Can we switch this to `core-test.fuellabs.net`? The records on the `mainnet.fuel.network` record can and will change over time, but I just set up the `core-test` DNS record to be static.
Thanks for setting this up. I created this follow-up issue to use the `core-test.fuellabs.net` hostname in the test, since this PR got auto-merged after my approval.
Looks good to me 👍 It would be nice to use the test records Mike suggested, but that can be done as a follow-up if you want to get this merged promptly for release.
@@ -5,6 +5,7 @@ pub mod behavior;
 pub mod codecs;
 pub mod config;
 pub mod discovery;
+mod dnsaddr_resolution;
Nit: I'd move this declaration to a separate block from the public modules.
## Version v0.36.0

### Added
- [2135](#2135): Added metrics logging for number of blocks served over the p2p req/res protocol.
- [2151](#2151): Added limitations on gas used during dry_run in API.
- [2188](#2188): Added the new variant `V2` for the `ConsensusParameters` which contains the new `block_transaction_size_limit` parameter.
- [2163](#2163): Added runnable task for fetching block committer data.
- [2204](#2204): Added `dnsaddr` resolution for TLD without suffixes.

### Changed

#### Breaking
- [2199](#2199): Applying several breaking changes to the WASM interface from backlog:
  - Get the module to execute WASM byte code from the storage first, and fall back to the built-in version in the case of the `FUEL_ALWAYS_USE_WASM`.
  - Added `host_v1` with a new `peek_next_txs_size` method, that accepts `tx_number_limit` and `size_limit`.
  - Added new variant of the return type to pass the validation result. It removes block serialization and deserialization and should improve performance.
  - Added a V1 execution result type that uses `JSONError` instead of postcard serialized error. It adds flexibility of how variants of the error can be managed. More information about it in FuelLabs/fuel-vm#797. The change also moves `TooManyOutputs` error to the top. It shows that `JSONError` works as expected.
- [2145](#2145): feat: Introduce time port in PoA service.
- [2155](#2155): Added trait declaration for block committer data
- [2142](#2142): Added benchmarks for varied forms of db lookups to assist in optimizations.
- [2158](#2158): Log the public address of the signing key, if it is specified
- [2188](#2188): Upgraded the `fuel-vm` to `0.57.0`. More information in the [release](https://github.com/FuelLabs/fuel-vm/releases/tag/v0.57.0).
## What's Changed
* chore(p2p_service): add metrics for number of blocks requested over p2p req/res protocol by @rymnc in #2135
* Weekly `cargo update` by @github-actions in #2149
* Debug V1 algorightm and use more realistic values in gas price analysis by @MitchTurner in #2129
* feat(gas_price_service): include trait declaration for block committer data by @rymnc in #2155
* Convert gas price analysis tool to CLI by @MitchTurner in #2156
* chore: add benchmarks for varied forms of lookups by @rymnc in #2142
* Add label nochangelog on weekly cargo update by @AurelienFT in #2152
* Log consensus-key signer address if specified by @acerone85 in #2158
* chore(rocks_db): move ShallowTempDir to benches crate by @rymnc in #2168
* chore(benches): conditional dropping of databases in benchmarks by @rymnc in #2170
* feat: Introduce time port in PoA service by @netrome in #2145
* Get DA costs from predefined data by @MitchTurner in #2157
* chore(shallow_temp_dir): panic if not panicking by @rymnc in #2172
* chore: Add initial CODEOWNERS file by @netrome in #2179
* Weekly `cargo update` by @github-actions in #2177
* fix(db_lookup_times): rework core logic of benchmark by @rymnc in #2159
* Add verification on transaction dry_run that they don't spend more than block gas limit by @AurelienFT in #2151
* bug: fix algorithm overflow issues by @MitchTurner in #2173
* feat(gas_price_service): create runnable task for expensive background polling for da metadata by @rymnc in #2163
* Weekly `cargo update` by @github-actions in #2197
* Fix bug with gas price factor in V1 algorithm by @MitchTurner in #2201
* Applying several breaking changes to the WASM interface from backlog by @xgreenx in #2199
* chore(p2p): dnsaddr recursive resolution by @rymnc in #2204

## New Contributors
* @acerone85 made their first contribution in #2158

**Full Changelog**: v0.35.0...v0.36.0
Linked Issues/PRs
`dnsaddr` resolution #2202
Description
`dnsaddr_resolution` which handles recursive dnsaddr resolution to add to the DHT without suffix matching. This way we can connect to all peers behind a domain without specifying the exact `PeerId`, like `/dnsaddr/bootstrap.libp2p.io`.
Checklist
Before requesting review
After merging, notify other teams