Contributor

@jgallagher jgallagher commented Jan 27, 2026

This builds on the BlueprintExpungedZoneAccessReason added in #9608 and adds a new set of zone IDs to PlanningInput for "pruneable zones". These are zones that are safe for the planner to prune in a future planning iteration, because they are:

  1. Expunged and ready for cleanup
  2. No longer needed by the rest of the system, implying any necessary cleanup is done (e.g., Nexus zones no longer have any assigned sagas/support bundles) and any relevant information has been replicated elsewhere (e.g., external DNS IPs have been reused)

This PR only adds the new set of zones to PlanningInput. It will be nearly trivial to add a step to the planner to "prune all zones present in the planning input's pruneable_zones set", but out of an abundance of caution I'd prefer to land this first, collect a reconfigurator state from some deployed racks, and confirm the contents match what we'd expect on some real systems.
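
For illustration only, the future "prune all zones present in the planning input's pruneable_zones set" step might look roughly like the sketch below. Everything here is a simplified assumption rather than the actual planner code: `ZoneId` stands in for the real typed zone UUID, and `BlueprintZones`, `PlanningInput::new`, and `prune_zones` are hypothetical names.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Stand-in for the real zone ID type (assumption for this sketch).
pub type ZoneId = u32;

/// Hypothetical, heavily simplified planning input: only the pruneable set.
pub struct PlanningInput {
    // Private, so callers can only read the set through the accessor.
    pruneable_zones: BTreeSet<ZoneId>,
}

impl PlanningInput {
    pub fn new(pruneable_zones: BTreeSet<ZoneId>) -> Self {
        Self { pruneable_zones }
    }

    pub fn pruneable_zones(&self) -> &BTreeSet<ZoneId> {
        &self.pruneable_zones
    }
}

/// Hypothetical blueprint contents: zone ID -> short description.
pub type BlueprintZones = BTreeMap<ZoneId, String>;

/// Remove every zone listed as pruneable from the blueprint.
/// Returns the IDs that were actually present and removed.
pub fn prune_zones(
    zones: &mut BlueprintZones,
    input: &PlanningInput,
) -> Vec<ZoneId> {
    let mut pruned = Vec::new();
    for id in input.pruneable_zones() {
        // IDs that are not in the blueprint are ignored.
        if zones.remove(id).is_some() {
            pruned.push(*id);
        }
    }
    pruned
}
```

The planner side stays small because all of the "is this zone still referenced?" decisions have already been made while preparing the input.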

This is a little weird in that the logic for whether a zone can be pruned lives in the "prepare a planning input" crate rather than the planner itself, but I think this is reasonable given the implementation of several of the checks (e.g., "query CRDB for this expunged zone specifically" to see whether it's still referenced, which the planner can't do since it doesn't have access to CRDB). Happy to discuss alternatives if this seems wrong upon review.

(We may also want to wait on #7278 before adding the actual prune step?)

// We have no way to confirm that this zone is "unreferenced" - that's a
// property of the system at large, mostly CRDB state - but we can
// confirm that it's expunged and ready for cleanup by looking at the
// parent blueprint.
Contributor Author

I was originally going to add a couple of blippy lints checking that any zone in expunged_and_unreferenced is (a) present and (b) actually expunged. But with the guard in this method and the fact that PlanningInput's fields are private, I don't believe it's actually possible to construct a PlanningInput whose expunged_and_unreferenced set contains a zone ID violating either condition, so it was impossible to write a test to confirm the blippy lints triggered.
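
A minimal sketch of the kind of guard described here, under stated assumptions: `ZoneId`, `ZoneDisposition`, `PruneableZoneError`, `PlanningInputBuilder`, and `insert_pruneable_zone` are hypothetical names for illustration, not the actual types or methods in this PR. The point is only that when the set is private and the sole way to add to it checks the parent blueprint, an invalid ID cannot get in.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Stand-in for the real zone ID type (assumption for this sketch).
pub type ZoneId = u32;

/// Hypothetical, simplified view of a zone's state in the parent blueprint.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ZoneDisposition {
    InService,
    Expunged { ready_for_cleanup: bool },
}

#[derive(Debug, PartialEq, Eq)]
pub enum PruneableZoneError {
    NotInParentBlueprint(ZoneId),
    NotExpungedAndReady(ZoneId),
}

#[derive(Debug)]
pub struct PlanningInputBuilder {
    parent_blueprint_zones: BTreeMap<ZoneId, ZoneDisposition>,
    // Private: the only way in is `insert_pruneable_zone`.
    pruneable_zones: BTreeSet<ZoneId>,
}

impl PlanningInputBuilder {
    pub fn new(
        parent_blueprint_zones: BTreeMap<ZoneId, ZoneDisposition>,
    ) -> Self {
        Self { parent_blueprint_zones, pruneable_zones: BTreeSet::new() }
    }

    /// Accept a zone only if the parent blueprint says it is expunged and
    /// ready for cleanup. Whether it is "unreferenced" cannot be checked here.
    pub fn insert_pruneable_zone(
        &mut self,
        id: ZoneId,
    ) -> Result<(), PruneableZoneError> {
        match self.parent_blueprint_zones.get(&id) {
            None => Err(PruneableZoneError::NotInParentBlueprint(id)),
            Some(ZoneDisposition::Expunged { ready_for_cleanup: true }) => {
                self.pruneable_zones.insert(id);
                Ok(())
            }
            Some(_) => Err(PruneableZoneError::NotExpungedAndReady(id)),
        }
    }

    pub fn pruneable_zones(&self) -> &BTreeSet<ZoneId> {
        &self.pruneable_zones
    }
}
```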

Collaborator

@davepacheco davepacheco left a comment

Nice! This is really clean. I've commented on one possible bug here and the rest is nitty feedback.

Comment on lines 184 to 185
clickhouse_config.keepers.contains_key(&zone_id)
|| clickhouse_config.servers.contains_key(&zone_id)
Collaborator

I don't know enough about this to evaluate this case -- may want to ask @karencfv or @andrewjstone to double-check.

Contributor

I think it should be fine? I'd like to get @andrewjstone's thoughts on this though, just to be sure.

Contributor

This looks good to me. I just read some of the code to remind myself what is going on: if there is no zone there, it must have been expunged. Since that's mainly what this PR is about, the same zone with a random UUID is not coming back magically.
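
For readers following this thread without the diff open, the check under discussion can be sketched in isolation as below. This is a simplified assumption, not the real types: `ZoneId` and the `u64` map values stand in for the typed zone UUID and keeper/server IDs, and `is_referenced_by_clickhouse` is a hypothetical helper name.

```rust
use std::collections::BTreeMap;

/// Stand-in for the real zone ID type (assumption for this sketch).
pub type ZoneId = u32;

/// Hypothetical, simplified clickhouse cluster config: each map is keyed by
/// the ID of the zone hosting that keeper or server.
pub struct ClickhouseClusterConfig {
    pub keepers: BTreeMap<ZoneId, u64>,
    pub servers: BTreeMap<ZoneId, u64>,
}

/// A zone still counts as referenced if the cluster config lists it as
/// either a keeper or a server.
pub fn is_referenced_by_clickhouse(
    config: &ClickhouseClusterConfig,
    zone_id: ZoneId,
) -> bool {
    config.keepers.contains_key(&zone_id)
        || config.servers.contains_key(&zone_id)
}
```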

@jgallagher
Contributor Author

Bunch of changes here (enough that it's probably easier to re-review than to look at the diff?):

  1. Rename "expunged and unreferenced" to "pruneable" throughout
  2. Add the runtime-built set of reasons checked plus a test that asserts they're all covered, as described in #9730 (comment)
  3. Add a whole bunch of tests that cover all the "zone not pruneable" reasons we check (this was LLM-assisted and is quite a bit of code, but I think not an unreasonable amount for what it's doing?)
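
One way to picture the "runtime-built set of reasons checked" idea from item 2 is the sketch below. It is an illustration under assumptions, not the code in this PR: `NotPruneableReason`, its variants, `ZoneFacts`, `CheckOutcome`, and `check_zone` are all hypothetical names, with the variants loosely modeled on the examples in the PR description.

```rust
use std::collections::BTreeSet;

/// Hypothetical reasons a zone might not be pruneable.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum NotPruneableReason {
    NotExpunged,
    NexusHasAssignedSagas,
    ExternalDnsIpNotReused,
}

impl NotPruneableReason {
    /// Every variant; a coverage test compares against this list.
    pub const ALL: [NotPruneableReason; 3] = [
        NotPruneableReason::NotExpunged,
        NotPruneableReason::NexusHasAssignedSagas,
        NotPruneableReason::ExternalDnsIpNotReused,
    ];
}

/// Hypothetical facts gathered about one zone (e.g., from CRDB queries).
pub struct ZoneFacts {
    pub expunged: bool,
    pub assigned_sagas: usize,
    pub dns_ip_reused: bool,
}

#[derive(Debug, Default)]
pub struct CheckOutcome {
    /// Every reason that was evaluated, whether or not it blocked pruning.
    pub checked: BTreeSet<NotPruneableReason>,
    /// The reasons that actually block pruning for this zone.
    pub failed: BTreeSet<NotPruneableReason>,
}

pub fn check_zone(facts: &ZoneFacts) -> CheckOutcome {
    let mut outcome = CheckOutcome::default();
    // Record each reason as it is evaluated so the set is built at runtime.
    let mut check = |reason: NotPruneableReason, blocked: bool| {
        outcome.checked.insert(reason);
        if blocked {
            outcome.failed.insert(reason);
        }
    };
    check(NotPruneableReason::NotExpunged, !facts.expunged);
    check(NotPruneableReason::NexusHasAssignedSagas, facts.assigned_sagas > 0);
    check(NotPruneableReason::ExternalDnsIpNotReused, !facts.dns_ip_reused);
    outcome
}
```

A coverage test can then assert that `checked` equals the set built from `ALL`, so adding a variant without a matching check fails the test (provided `ALL` is kept in sync with the enum).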

@jgallagher jgallagher changed the title from [reconfigurator] Add "expunged and unreferenced" zones to PlanningInput to [reconfigurator] Add "pruneable zones" to PlanningInput Feb 2, 2026
5 participants