Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

instance create sagas look up multiple things by name multiple times #1536

Open
davepacheco opened this issue Aug 2, 2022 · 3 comments
Open
Labels
security Related to security.
Milestone

Comments

@davepacheco
Copy link
Collaborator

There are three places in the instance-create saga where we look up things by name well into the saga:

  1. Towards the end, when we go to attach disks (or detach them, during undo), we use Nexus::instance_attach_disk()/Nexus::instance_detach_disk() and pass along the organization_name, project_name, instance_name, and disk_name from the instance create request. This causes a new lookup of the organization/project/instance/disk path. That seems bad: there's no guarantee that it will find the same disk (or instance or project or organization, for that matter). So we might attach or detach the wrong disk, or we might fail spuriously.
  2. At the end, during instance_ensure(), we look up the instance by name in order to call Nexus::instance_set_runtime(). This could find a different instance than the one we created. We should be able to use the instance_id stored from a previous action instead.
  3. In sic_delete_instance_record, the undo action for the (early) create-instance-record action, we look up the instance name that we created. We should use the instance id instead.

RFD 192 talks about this:

When Nexus receives an API request, we’ll usually want to resolve the name to an id up front and do most internal work using ids. Otherwise, we can wind up with nasty concurrency issues (i.e., serializability violations) if somebody renames a resource while we’re working on it. This can lead to security vulnerabilities, if we were to do an access check of some resource and then somebody renames it and we wind up operating on a different resource than the one we checked.

In terms of fixing: my first thought here is to do the lookup of organization and project once early in the saga to get the project_id. Then (again, still early in the saga) look up anything else we need (like all the disks) using LookupPath(...).project_id(...).disk_name(...) and store their ids. Thereafter, we can always use the ids.

We'll probably want to change Nexus::instance_attach_disk() and Nexus::instance_detach_disk() to accept an authz::Instance and authz::Disk (and have the lookups done in the caller).

Then I think we can remove the "instance_name" output from any saga action (so we're not ever tempted to use the instance name). We can also remove organization_name and project_name from the Params (for the same reason).

@davepacheco davepacheco added the security Related to security. label Aug 2, 2022
@davepacheco davepacheco mentioned this issue Aug 2, 2022
5 tasks
@davepacheco
Copy link
Collaborator Author

This seems easy to get wrong. It makes me wonder if we should have the Nexus layer only accept authz resources (rather than ever accepting names) and having callers (mostly the HTTP layer, but sometimes the saga layer) use LookupPath directly. I think it'd be fair to say that the mapping of an API URL path to a resource is part of the API layer anyway, not Nexus.

I'm reminded too that right after that quote from RFD 192, it also says that some operations could be done without a separate lookup step altogether. To do that, you do want to pass the names all the way down to the datastore. Doing this correctly also requires integrating authz into the query, which feels like a long way off. So maybe we could consider this an optimization to be re-added later.

A middle ground might be to accept LookupPaths. That would avoid encouraging people to pass a name without actually preventing it. It would create its own problem, which is that if you've already got the authz object, you'll wind up needing to make a LookupPath out of it and then we'll wind up looking it up again. This too could be addressed: we could have a LookupPath::instance_value or something that, when you go do the lookup(), just returns what it was given. This approach seems like a can of worms not worth opening now. Maybe we can look into this if we decide that optimization is important.

@zephraph
Copy link
Contributor

Once RFD-322 starts getting implemented I believe we should solve this at the API level. E.g. we should at least move everything to operate as id by default. Id or LookupPath might be a viable option depending, but I think we should restrict the usage of names (as much as possible) to go no deeper than the http_entrypoints file.

@davepacheco davepacheco added this to the MVP milestone Feb 2, 2023
leftwo pushed a commit that referenced this issue Nov 19, 2024
No Propolis changes other than to update Crucible

Crucible changes are:
Add debug/timeout to test_memory.sh (#1563)
Consolidate ack checking (#1561)
Rename for crutest: RegionInfo -> DiskInfo (#1562)
Fix dtrace system level scripts (#1560)
Remove `ackable_work`; ack immediately instead (#1552)
No more New jobs, no more New jobs column (#1559)
Remove delay-based backpressure in favor of explicit queue limits (#1515)
Only send flushes when Downstairs is idle; send Barrier otherwise (#1505)
Update Rust crate reqwest to v0.12.9 (#1536)
Update Rust crate omicron-zone-package to 0.11.1 (#1535)
Remove separate validation array (#1522)
Remove more unnecessary `DsState` variants (#1550)
Consolidate `DownstairsClient::reinitialize` (#1549)
Update Rust crate uuid to v1.11.0 (#1546)
Update Rust crate reedline to 0.36.0 (#1544)
Update Rust crate bytes to v1.8.0 (#1541)
Update Rust crate thiserror to v1.0.66 (#1539)
Update Rust crate serde_json to v1.0.132 (#1538)
Update Rust crate serde to v1.0.214 (#1537)
Remove transient states in `DsState` (#1526)
Update Rust crate libc to v0.2.161 (#1534)
Update Rust crate futures to v0.3.31 (#1532)
Update Rust crate clap to v4.5.20 (#1531)
Update Rust crate async-trait to 0.1.83 (#1530)
Update Rust crate anyhow to v1.0.92 (#1529)
Remove obsolete crutest perf test (#1528)
Update dependency rust to v1.82.0 (#1512)
Still more updates to support Volume layer activities. (#1508)
Remove remaining IOPS/bandwidth limiting code (#1525)
Add unit test for VersionMismatch (#1524)
Removing panic paths by only destructuring once (#1523)
Update actions/checkout digest to 11bd719 (#1518)
Switch to using `Duration` for times (#1520)
leftwo added a commit that referenced this issue Nov 20, 2024
No Propolis changes other than to update Crucible

Crucible changes are:
Add debug/timeout to test_memory.sh (#1563)
Consolidate ack checking (#1561)
Rename for crutest: RegionInfo -> DiskInfo (#1562) Fix dtrace system
level scripts (#1560)
Remove `ackable_work`; ack immediately instead (#1552) No more New jobs,
no more New jobs column (#1559)
Remove delay-based backpressure in favor of explicit queue limits
(#1515) Only send flushes when Downstairs is idle; send Barrier
otherwise (#1505) Update Rust crate reqwest to v0.12.9 (#1536)
Update Rust crate omicron-zone-package to 0.11.1 (#1535) Remove separate
validation array (#1522)
Remove more unnecessary `DsState` variants (#1550) Consolidate
`DownstairsClient::reinitialize` (#1549) Update Rust crate uuid to
v1.11.0 (#1546)
Update Rust crate reedline to 0.36.0 (#1544)
Update Rust crate bytes to v1.8.0 (#1541)
Update Rust crate thiserror to v1.0.66 (#1539)
Update Rust crate serde_json to v1.0.132 (#1538)
Update Rust crate serde to v1.0.214 (#1537)
Remove transient states in `DsState` (#1526)
Update Rust crate libc to v0.2.161 (#1534)
Update Rust crate futures to v0.3.31 (#1532)
Update Rust crate clap to v4.5.20 (#1531)
Update Rust crate async-trait to 0.1.83 (#1530)
Update Rust crate anyhow to v1.0.92 (#1529)
Remove obsolete crutest perf test (#1528)
Update dependency rust to v1.82.0 (#1512)
Still more updates to support Volume layer activities. (#1508) Remove
remaining IOPS/bandwidth limiting code (#1525) Add unit test for
VersionMismatch (#1524)
Removing panic paths by only destructuring once (#1523) Update
actions/checkout digest to 11bd719 (#1518)
Switch to using `Duration` for times (#1520)

Co-authored-by: Alan Hanson <alan@oxide.computer>
hawkw added a commit that referenced this issue Jan 30, 2025
There's a comment in the `instance_create` saga`s
`sic_delete_instance_record` compensating action that states it would be
nicer to lookup the instance record to delete by ID rather than by
project and name. The comment references issue
#1536.

While I was here changing this action, I figured I'd go ahead and change
this as well. My assumption is that the previous thing predates the
`LookupPath::instance_id` method?
@hawkw
Copy link
Member

hawkw commented Jan 30, 2025

#7437 changes the sic_delete_instance_record action to look up the instance by ID rather than by name. The disk attach actions still look up disks by project/instance/disk name. The instance_ensure call is no longer performed by the instance_create saga, and is now part of the instance-start saga instead.

hawkw added a commit that referenced this issue Jan 30, 2025
When the `instance_create` saga's `sic_create_instance_record` action
unwinds, it executes the compensating action
[`sic_delete_instance_record`][1].

This action moves the instance's state to `Failed` prior to actually
calling into `project_delete_instance` to delete it, [here][2]. This
is because we presently only allow instances to be deleted when they
are in the `Stopped` or `Failed` states, as noted [here][3].

Because we must first transition the instance to `Failed` in order to
delete it, there is an intermediate state when the instance
record created by an unwinding saga exists but is in the `Failed`
state. Instances in the `Failed` state are eligible to be restarted by
the `instance_reincarnation` background task. That task queries for all
`Failed`` instances and creates `instance_start` sagas to attempt to
restart them.

Therefore, if the `instance_reincarnation` background task runs during
the window of time between when an unwinding `instance_create` saga
marks the instance record as `Failed` and when it actually attempts to
call `project_delete_instance`, the instance can transition to
`Starting` by the new instance-start saga. This results in the attempt
to delete it failing, causing the unwinding saga to get stuck. This is
not great --- it causes a test flake (see #7326), but it's actually a
real bug, as it can result in a saga unable to unwind.

This commit fixes this by moving most of `project_delete_instance` to a
new function, `project_delete_instance_in_state`, which accepts a list
of states in which the instance may be deleted as an argument.
`project_delete_instance` now calls that function with the "normal" list
of states, but unwinding instance-create sagas are additionally able to
allow the instance record to be deleted while it's `Creating` in a
single atomic database operation, avoiding the transient `Failed` state.
This fixes #7326.

Unfortunately, it's quite challenging to have a regression test for
this, because it would require interrupting the unwinding saga's
`sic_delete_instance_record` mid-activation, which we don't really have
a nice mechanism for.

Also, there was a comment in the `sic_delete_instance_record` action
that stated we should be looking up the instance record to delete by
ID rather than by project and name. The comment references issue #1536.
While I was here changing this action, I figured I'd go ahead and change
this as well. My assumption is that the previous thing predates the
`LookupPath::instance_id` method?

[1]: https://github.com/oxidecomputer/omicron/blob/ec4b5dc3c0c45b667e57a52389d82382b0b59112/nexus/src/app/sagas/instance_create.rs#L970
[2]: https://github.com/oxidecomputer/omicron/blob/ec4b5dc3c0c45b667e57a52389d82382b0b59112/nexus/src/app/sagas/instance_create.rs#L1010-L1033
[3]: https://github.com/oxidecomputer/omicron/blob/ec4b5dc3c0c45b667e57a52389d82382b0b59112/nexus/src/app/sagas/instance_create.rs#L987-L988
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
security Related to security.
Projects
None yet
Development

No branches or pull requests

3 participants