DNS servers should have NS and SOA records #8047

Open, wants to merge 57 commits into main

Member

@iximeow iximeow commented Apr 24, 2025

this is probably the more exciting part of the issues outlined in #6944. the changes here get us to the point that for both internal and external DNS, we have:

  • A/AAAA records for the DNS servers in the internal/external group (named ns1.<zone>, ns2.<zone>, ...)
  • NS records for those servers at the zone apex, one for each of the ns*.<zone> described above
  • an SOA record synthesized on-demand for the zone apex for each of oxide.internal (for internal DNS) and $delegated_domain (for external DNS)
  • the SOA's serial is updated whenever the zone is changed. serial numbers are effectively the DNS config generation, so they start from 1 and tick upward with each change. this is different from most SOA serial schemes (in particular the ones that would use YYYYMMDDNN numbering schemes) but so far as i can tell this is consistent with RFC 1035 requirements.
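As a rough illustration of the record layout in the list above, here is a minimal sketch. The type and function names (`Record`, `Soa`, `nameserver_records`, `synthesize_soa`) are hypothetical and are not the types used in this PR; the SOA timing values are copied from the `dig` output further down in this description and are only illustrative.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

/// Hypothetical record type for this sketch (not the PR's `DnsRecord`).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Record {
    A(Ipv4Addr),
    Aaaa(Ipv6Addr),
    Ns(String),
}

/// Hypothetical SOA representation for this sketch.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Soa {
    pub mname: String,
    pub rname: String,
    pub serial: u32,
    pub refresh: u32,
    pub retry: u32,
    pub expire: u32,
    pub minimum: u32,
}

/// For each server, emit an address record at `nsN.<zone>` and an NS record
/// at the zone apex pointing to that name. Numbering starts at 1.
pub fn nameserver_records(zone: &str, servers: &[IpAddr]) -> Vec<(String, Record)> {
    let mut out = Vec::new();
    for (i, addr) in servers.iter().enumerate() {
        let ns_name = format!("ns{}.{}", i + 1, zone);
        let address = match addr {
            IpAddr::V4(v4) => Record::A(*v4),
            IpAddr::V6(v6) => Record::Aaaa(*v6),
        };
        out.push((ns_name.clone(), address));
        out.push((zone.to_string(), Record::Ns(ns_name)));
    }
    out
}

/// Build the apex SOA on demand; the serial is the DNS config generation.
/// Timing values are assumptions taken from the sample `dig` output.
pub fn synthesize_soa(zone: &str, serial: u32) -> Soa {
    Soa {
        mname: format!("ns1.{}", zone),
        rname: format!("admin.{}", zone),
        serial,
        refresh: 3600,
        retry: 600,
        expire: 18000,
        minimum: 150,
    }
}
```

With two servers in `oxide.test`, this yields address records for `ns1.oxide.test` and `ns2.oxide.test` plus two apex NS records, matching the shape of the `dig` answers shown below.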

we do not support zone transfers here. i believe the SOA record here would be reasonable to guide zone transfers if we did, but obviously that's not something i've tested.

SOA fields

the SOA record's RNAME is hardcoded to admin@<zone_name>. this is out of expediency to provide something, but it's probably wrong most of the time. there's no way to get an MX record installed for <zone_name> in the rack's external DNS servers, so barring DNS hijinks in the deployed environment, this will be a dead address. problems here are:

  • we would want to take in an administrative email at rack setup time, so that would be minor plumbing
  • more importantly, what to backfill this with for deployed systems?

it seems like the best answer here is to allow configuration of the rack's delegated domain and zone after initial setup, and being able to update an administrative email would fit in pretty naturally there. but we don't have that right now, so admin@ it is. configuration of external DNS is probably more important in the context of zone transfers, where we'd need a list of remote addresses we're willing to permit transfers to. so it feels like this is in the API's future at some point.
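For context on why `admin@<zone_name>` shows up as `admin.oxide.test.` in the SOA answer below: RFC 1035 encodes the RNAME mailbox as a domain name, with the local part as the first label. A minimal sketch of that mapping follows; `email_to_rname` is a hypothetical helper, not something in this PR, and it only handles escaping of dots in the local part.

```rust
/// Convert a mailbox such as `admin@oxide.test` into SOA RNAME presentation
/// form (`admin.oxide.test.`). Dots in the local part are escaped with a
/// backslash so they are not read as label separators.
/// Returns `None` if the input is not of the form `local@domain`.
pub fn email_to_rname(email: &str) -> Option<String> {
    let (local, domain) = email.rsplit_once('@')?;
    // Accept a domain with or without a trailing dot; always emit one.
    let domain = domain.trim_end_matches('.');
    if local.is_empty() || domain.is_empty() {
        return None;
    }
    let escaped_local = local.replace('.', "\\.");
    Some(format!("{}.{}.", escaped_local, domain))
}
```

If an administrative email were taken at rack setup time, a conversion along these lines would be needed before the value could be placed in the SOA record.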

bonus

one minorly interesting observation along the way is that external DNS servers in particular are reachable at a few addresses: whichever address they get in the rack's internal address range, and whichever address they get in the external address range. the external (public) address is what's used for A/AAAA records. so, if you're looking around from inside a DNS zone you can get odd-looking answers like:

# 172.30.1.5 is the internal address that an external DNS server is bound to.
# oxide.test is the delegated domain for this local Omicron deployment.
root@oxz_external_dns_68c5e255:~# dig +short ns2.oxide.test @172.30.1.5
192.168.0.161
root@oxz_external_dns_68c5e255:~# dig +short soa oxide.test @172.30.1.5
ns1.oxide.test. admin.oxide.test. 2 3600 600 18000 150
root@oxz_external_dns_68c5e255:~# dig +short ns oxide.test @172.30.1.5
ns1.oxide.test.
ns2.oxide.test.
# 192.168.0.160 is an external address for this same server.
# there are no records referencing 172.30.1.5 here.
root@oxz_external_dns_68c5e255:~# dig +short ns oxide.test @192.168.0.160
ns1.oxide.test.
ns2.oxide.test.
root@oxz_external_dns_68c5e255:~# dig +short ns1.oxide.test @192.168.0.160
192.168.0.160

@iximeow iximeow added the "release notes" label (reminder to include this in the release notes) Apr 24, 2025
@iximeow iximeow force-pushed the ixi/dns-ns-and-soa branch 2 times, most recently from 842455b to f349290 Compare April 25, 2025 21:50
@iximeow iximeow force-pushed the ixi/dns-ns-and-soa branch from f349290 to fa47ab1 Compare April 25, 2025 22:08
Comment on lines +174 to +207
impl From<Srv> for DnsRecord {
    fn from(srv: Srv) -> Self {
        DnsRecord::Srv(srv)
    }
}

#[derive(
    Clone,
    Debug,
    Serialize,
    Deserialize,
    JsonSchema,
    PartialEq,
    Eq,
    PartialOrd,
    Ord,
)]
pub struct Srv {
    pub prio: u16,
    pub weight: u16,
    pub port: u16,
    pub target: String,
}

impl From<v1::config::Srv> for Srv {
    fn from(other: v1::config::Srv) -> Self {
        Srv {
            prio: other.prio,
            weight: other.weight,
            port: other.port,
            target: other.target,
        }
    }
}
Member Author

the other option here is to use the v1::config::Srv type directly in v2, because it really has not changed. weaving the V1/V2 types together seems more difficult to think about generally, but i'm very open to the duplication being more confusing if folks feel that way.

Collaborator

I would probably use the v1 types directly but I can see going either way.
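For comparison, the "use the v1 types directly" option could look roughly like the sketch below. The module layout is simplified (`v1::Srv` rather than `v1::config::Srv`), the derives are trimmed, and the `Ns(String)` variant is an assumption added only to show a v2-specific record next to the shared type.

```rust
pub mod v1 {
    /// Stand-in for the existing v1 SRV type.
    #[derive(Clone, Debug, PartialEq, Eq)]
    pub struct Srv {
        pub prio: u16,
        pub weight: u16,
        pub port: u16,
        pub target: String,
    }
}

pub mod v2 {
    // Re-export the unchanged v1 type instead of defining a duplicate,
    // so no `From<v1::Srv> for Srv` conversion is needed.
    pub use super::v1::Srv;

    /// Simplified v2 record enum; the `Ns` payload shape is hypothetical.
    #[derive(Clone, Debug, PartialEq, Eq)]
    pub enum DnsRecord {
        Srv(Srv),
        Ns(String),
    }

    impl From<Srv> for DnsRecord {
        fn from(srv: Srv) -> Self {
            DnsRecord::Srv(srv)
        }
    }
}
```

The trade-off is the one described above: the re-export removes the field-by-field conversion, but ties the v2 module to a v1 definition that may later be removed.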

@iximeow iximeow marked this pull request as ready for review May 1, 2025 21:58
@@ -4,9 +4,12 @@ load-example --seed test_expunge_newly_added_external_dns

blueprint-show 3f00b694-1b16-4aaa-8f78-e6b3a527b434
blueprint-edit 3f00b694-1b16-4aaa-8f78-e6b3a527b434 expunge-zone 9995de32-dd52-4eb1-b0eb-141eb84bc739
blueprint-diff 3f00b694-1b16-4aaa-8f78-e6b3a527b434 366b0b68-d80e-4bc1-abd3-dc69837847e0
Member Author

unfortunately, between the diff size and conflicting changes on main, i had a hard time keeping this diff as a more legible "file moved and now has some additional lines". instead, git shows it as a fully new file even though it's mostly the prior content.

blueprint-diff includes the DNS output though, which is of course what i actually care about here. if this is a bear to review (and i'm pretty empathetic to it being a lot) i'm open to moving the DNS checking over to a new test and leaving this unchanged, or moving the internal DNS testing to live in this test as well.


blueprint-show 62422356-97cd-4e0f-bd17-f946c25193c1
blueprint-edit 62422356-97cd-4e0f-bd17-f946c25193c1 expunge-zone 3fc76516-d258-48bc-b25e-9fca5e37c888
blueprint-diff 62422356-97cd-4e0f-bd17-f946c25193c1 14b8ff1c-91ff-4ab7-bb64-3c0f5f642e09
Member Author

this one surprised me and i've added this diff to reiterate that for testing: internal DNS zones are not replaced simply as a result of being expunged, since we might need to reuse the IP that server was listening on. for internal DNS in particular, the expunged zone must be ready_for_cleanup. i don't know concretely what that means (sled-agent did a collection and saw the zone is gone?), but that's a critical step in actually seeing DNS changes in the diff below.

Contributor

i don't know concretely what that means (sled-agent did a collection and saw the zone is gone?)

Almost! Reconfigurator will mark a zone ready for cleanup during planning if in the most recent inventory collection, sled-agent reported:

  • the zone is gone
  • the generation of the sled's config is >= the generation in which the zone was expunged (to avoid a race where the zone is gone because it hasn't even started yet)
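The two conditions above can be sketched as a small predicate. The struct and field names here are hypothetical simplifications for illustration, not Reconfigurator's actual types.

```rust
/// Hypothetical, simplified view of what sled-agent reported in the most
/// recent inventory collection.
pub struct SledInventory {
    /// Generation of the sled config that sled-agent has applied.
    pub config_generation: u64,
    /// Ids of zones sled-agent reported as present.
    pub present_zone_ids: Vec<u32>,
}

/// Hypothetical record of an expunged zone in the blueprint.
pub struct ExpungedZone {
    pub id: u32,
    /// Sled config generation in which the zone was expunged.
    pub expunged_as_of_generation: u64,
}

/// A zone is ready for cleanup only if it is gone AND the sled has caught up
/// to the generation that expunged it. The second check avoids treating a
/// zone as gone when the sled simply has not started it yet.
pub fn ready_for_cleanup(zone: &ExpungedZone, inv: &SledInventory) -> bool {
    let zone_gone = !inv.present_zone_ids.contains(&zone.id);
    let config_caught_up = inv.config_generation >= zone.expunged_as_of_generation;
    zone_gone && config_caught_up
}
```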

iximeow added 14 commits May 22, 2025 20:33
on one hand: now that DNS servers are referenced by potentially two
different AAAA records, both of those records are potentially the target
of a SRV record. though, we don't have SRV records for the DNS
interface. this test had failed at first because we'd find a DNS
server's IP via the `ns1.` record, which means we'd miss that the same
zone was referenced by an AAAA record for the illumos zone UUID.

on the other hand: #[nexus_test] environments involve a mock of the
initial RSS environment construction? so now that the first blueprint
adds NS records, this mock RSS environment was out of date, and a test
that the first blueprint after "RSS" makes no change failed because the
"RSS" environment was wrong.
each name has a list of records so calling the high-level collection
"records" makes for some confusing words
Collaborator

@davepacheco davepacheco left a comment

Nice -- this is looking pretty good! I don't think anything here is a real blocker but it would be good to clean up if we can.

Comment on lines +445 to +448
///
/// this typically does not mean anything different than any other expunged
/// zone, except that internal DNS zones are not replaced until they are
/// definitively marked "ready for cleanup".
Collaborator

I'd strike this. I don't think that's true. IIRC Nexus zones have a cleanup step that involves re-assigning sagas and Cockroachdb zones have a step that decommissions nodes, etc.

Member Author

ah, that makes sense. i'd looked for where else ready_for_cleanup is used but probably missed some details. i hadn't realized at this point that i can write # comments in the reconfigurator-cli tests anyway, which is really where i wanted to highlight this command.

Comment on lines 9 to 11
# Mark the internal DNS zone ready for cleanup.
# This approximates sled-agent performing an inventory collection and seeing the DNS zone has gone away.
# Afterward, diffing should show that the server's records are removed from DNS.
Collaborator

I believe the behavior here is correct but the comment seems wrong to me. The DNS records for the internal DNS server expunged at L5 were shown as removed in the diff at L7, right? And there are no DNS changes in the diff at L13. This is the behavior I'd expect.

Member Author

agreed on both counts. i'd meant to emphasize that until we mark-for-cleanup, a new plan won't add a new internal DNS zone even though the old one had been expunged, but misremembered what i'd seen in the output and said it pretty poorly. lemme clean that up too..

dropshot::HttpError,
> {
let result = Self::dns_config_get(rqctx).await?;
match result.0.try_into() {
Collaborator

Nice. It seems like the API versioning stuff worked out nicely here.

Comment on lines 600 to 604
anyhow::ensure!(
    service == ServiceName::ExternalDns,
    "This method is only valid for external DNS servers, \
     but we were provided the service '{service:?}'",
);
Collaborator

Similarly, I'd remove this argument altogether.

let dns_config_blueprint = DnsConfigParams {
    zones: vec![dns_zone_blueprint],
    time_created: chrono::Utc::now(),
    generation: blueprint_generation.next(),
    serial: new_dns_generation.as_u64().try_into().map_err(|_| {
Collaborator

I see -- it looks like you split the difference here. The configuration distinguishes between "serial" and "generation", but this is the only place that sets them, and it always makes them the same. So we don't have to worry about maintaining a serial in lockstep with the generation when we update the database.

This seems fine.

Member Author

yeah, i really like the status quo that there is not a DnsConfigParams which can result in the DNS server failing to serve records. to maintain that, either DnsConfigParams::generation should become a u32 (seems very wrong), or serial ends up a distinct u32.
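The narrowing being discussed can be sketched on its own: the generation stays a 64-bit value, and the 32-bit SOA serial is derived from it with a checked conversion at the point where the config is built. `serial_from_generation` is a hypothetical name, and returning a `String` error is a simplification for this sketch.

```rust
// Needed on older editions; already in the prelude from edition 2021.
use std::convert::TryFrom;

/// Derive the 32-bit SOA serial from a 64-bit DNS config generation.
/// Fails instead of truncating, so a config with a serial that does not
/// match its generation is never constructed.
pub fn serial_from_generation(generation: u64) -> Result<u32, String> {
    u32::try_from(generation).map_err(|_| {
        format!(
            "DNS generation {} does not fit in a 32-bit SOA serial",
            generation
        )
    })
}
```

Doing the check once when the config is built means the DNS server itself never has to handle a serial that cannot be represented.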

#[derive(Clone, Debug, Serialize, Deserialize, JsonSchema, PartialEq, Eq)]
pub struct DnsConfigZone {
    pub zone_name: String,
    pub names: HashMap<String, Vec<DnsRecord>>,
Collaborator

super nitty and unimportant, but: I feel like records was more accurate. I guess I expect maps to be named either by what each key-value pair represents or what the value represents, not what the key represents. But now I wonder how universal that is!

Member Author

i suppose i was thinking about this as: a "name" is the pair of a label and a collection of records, and we often happen to call the label a "name". that's not totally accurate, since the key here could be multiple labels anyway. but i agree with your instinct and this is why it didn't strike me as confusing at first :)

this was a simple change, i'll probably revert it and add a few comments on the relevant test asserts instead.

Collaborator

You could be right if people read "DNS name" to refer to the (label, records) pair. I tend to use that interchangeably with "label" but maybe that's wrong.

Anyway, not a big deal either way, though there's something to be said for not having different names for the same thing in two different API versions. Then again, we can probably remove API version 1 in the next release anyway.

Member Author

already reverted it! i expect i'm the outlier here, and either way it ends up ambiguous in some circumstances.

iximeow and others added 3 commits May 30, 2025 12:57
Co-authored-by: David Pacheco <dap@oxidecomputer.com>
confusing name options abound. "names" is ambiguous with the keys,
"records" is ambiguous with the values, maybe it would be better to call
this "subdomains"???? but for now stick with what we've got and add some
clarifying comments.

This reverts commit ff63ea1.
* incorrect comments around the internal DNS expunge test
* internal DNS config does not need to track external DNS separately