Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

strategy: add new marker_file strategy #540

Closed
wants to merge 3 commits into from
Closed

strategy: add new marker_file strategy #540

wants to merge 3 commits into from

Conversation

kelvinfan001
Copy link
Member

@kelvinfan001 kelvinfan001 commented Apr 28, 2021

Add a simple, flexible strategy that checks /var/lib/zincati for
a JSON file (optionally containing an expiry timestamp key) named
allowfinalize.json and allows update finalization if such file
exists and has the proper permissions.

See #245 (comment) for the rationale for this new strategy.

Copy link
Member

@cgwalters cgwalters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems sane to me!

src/main.rs Outdated Show resolved Hide resolved
src/strategy/mod.rs Outdated Show resolved Hide resolved
src/strategy/filesystem.rs Outdated Show resolved Hide resolved
src/strategy/filesystem.rs Outdated Show resolved Hide resolved
src/strategy/filesystem.rs Outdated Show resolved Hide resolved
let mode = attr.permissions().mode();

// Octal 100644 represents regular file, rw-r--r--.
if mode != 0o100644 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may have missed it, why are we strictly requiring 0644?

Copy link
Member Author

@kelvinfan001 kelvinfan001 Apr 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought this would be to prevent non-root users from being able to modify it, since this directly affects reboots.

Edit: I found out that as long as we set the directory to be owned by Zincati and not writable by others that'll satisfy our requirements for not letting non-root users add and remove files. So we just need to require it to be a regular file and not writable by others.

src/strategy/filesystem.rs Outdated Show resolved Hide resolved
Makefile Outdated Show resolved Hide resolved
@lucab lucab added this to the vNext milestone Apr 29, 2021
@lucab
Copy link
Contributor

lucab commented Apr 29, 2021

I'd like to flush out the current pile of changes into a release before getting into reviewing this.

Additionally, this should come with some docs for users.

@kelvinfan001
Copy link
Member Author

@lucab yep, still working on the docs and possibly some kola tests. (sorry forgot to apply on-hold label).

src/strategy/filesystem.rs Outdated Show resolved Hide resolved
src/strategy/filesystem.rs Outdated Show resolved Hide resolved
src/strategy/filesystem.rs Outdated Show resolved Hide resolved
@kelvinfan001
Copy link
Member Author

Notes:
After this basic functionality is merged, we should consider adding an inotify watch on the allowfinalize.json file to tick now so reboots happen immediately if an update is staged.
Also consider extending the marker file with an additional persistAfterReboot key where Zincati will remove the file before rebooting if false.

anyhow::bail!("file is not regular file");
}

ensure_owned_by_root(&attr)?;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need this check if /var/lib/zincati is owned by the zincati user already anyway?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we trust systemd-tmpfiles for that (also, it's probably fine if the hierarchy is owned by root either).

@kelvinfan001 kelvinfan001 changed the title strategy: add new filesystem strategy strategy: add new marker_file strategy May 17, 2021
@lucab lucab modified the milestones: v0.0.21, vNext May 18, 2021
Comment on lines +75 to +76
- named `allowfinalize.json`
- under `/var/lib/zincati/admin/strategy/marker_file`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    - named `allowfinalize.json`
    - under `/var/lib/zincati/admin/strategy/marker_file`

I wonder how this will work if you have multiple containers that might need to control reboot. If they are all trying to edit a single file then it gets problematic. Could or should we support multiple files under here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a legit point but I'd say it's outside of the scope of this strategy. Specifically, here we have a bias towards a single controller who can be audited/blamed for finalization decisions.

Multiple un-coordinated agents are a likely interesting case but it possibly needs either 1) higher-level coordination, or 2) a different approach (i.e. some kind of rule to determine who makes the final go/no-go call).

If any of the above is not satisfied in your marker file, Zincati will not allow reboots.

`allowfinalize.json` can optionally contain an `allowUntil` key with a Unix timestamp integer as its value to indicate the expiry date and time of this marker file. If the current time timestamp is _greater than or equal to_ this timestamp, then reboots will not be allowed.
Otherwise, if the `allowUntil` key is not present, reboots will be allowed for as long as `allowfinalize.json` exists (in the right location), and it must be removed to disallow reboots.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking aloud: do we need this to be optional right now? I am not fully sold on this way to say "allow all finalizations forever". Should we maybe start with making this mandatory and relax this to an Option later on if there is a need to?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking that the ability to leave a (valid) file in the directory forever would allow for the use case of "allow reboots for the most part unless I need to disallow it for a while". So for example, if theoretically Dusty's containers had some central controller, that controller could easily make use of this new strategy.

| sudo tee /var/lib/zincati/admin/strategy/marker_file/allowfinalize.json
```

Warning: In `jq` versions `1.6` and lower, `jq` [may output incorrect Unix timestamps][jq_bug] for certain datetimes on machines with certain `localtime`s.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds scary enough, do we want fromdateiso8601 maybe instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think fromdateiso8601 is the same as fromdate right now, but right, I think fromdateiso8601 would be more explicit.

anyhow::bail!("file is not regular file");
}

ensure_owned_by_root(&attr)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we trust systemd-tmpfiles for that (also, it's probably fine if the hierarchy is owned by root either).

@lucab
Copy link
Contributor

lucab commented Jun 1, 2021

Dusty was commenting out-of-band that the name here is still somehow suboptimal, possibly because still too generic.
I do share the same feeling, but I didn't come up with better ideas so far.

/cc @dustymabe

@lucab
Copy link
Contributor

lucab commented Jun 1, 2021

@kai-uwe-rommel we do envision this strategy (name still to be finalized) to provide the basic primitive for a central manager (like Ansible) pushing finalization markers to individual hosts.
I tried to put together a very simple playbook to see how the integration could look like, at https://github.com/lucab/ansible-zincati-marker-file (no fancy stuff, just basic file writing/deleting).
Can you give a look at all of this and see if it would generally fit into your expected flow?

@dustymabe
Copy link
Member

Dusty was commenting out-of-band that the name here is still somehow suboptimal, possibly because still too generic.
I do share the same feeling, but I didn't come up with better ideas so far.

Yes. We were discussing this in the context of default allow versus default disallow (inhibit). This name is only sub-optimal if it doesn't cover both of those scenarios (which it currently doesn't) and we want to add another strategy later for the default to allow case in which case we might want to make it more obvious what the difference is between the two strategies.

Random ideas:

  • green_file (green light on a stop light for "go") - red_file in the future if we want to support disallowing update via file strategy
  • allow_file/disallow_file

I'm not really in love with any of them.

@kelvinfan001
Copy link
Member Author

kelvinfan001 commented Jun 1, 2021

We were discussing this in the context of default allow versus default disallow (inhibit). This name is only sub-optimal if it doesn't cover both of those scenarios (which it currently doesn't) and we want to add another strategy later for the default to allow case in which case we might want to make it more obvious what the difference is between the two strategies.

Sorry if this had already been discussed/clarified before elsewhere, but I still feel like default allow and default disallow is not too important here for this strategy. In theory, if there were a reliable single controller that is in charge of writing and removing a marker file, then the "defaults" for this strategy is totally up to the controller, right? I do recall vaguely it being mentioned that it is easier to touch a file than to remove a file, or vice versa, and so there might be some preference around whether we would like to default to Zincati being stuck in either constantly allowing reboots or constantly denying reboots. But I feel like this is a separate issue.

I think the key point for this strategy, in my opinion, is the emphasis that there should be a single controller that is the single decider on whether finalizations are allowed. If multiple processes/containers wanted to express their intent to allow/disallow reboots, that should be communicated to the central controller, rather than to Zincati's update strategy directly. Let me know if I've misunderstood anything :)

@kai-uwe-rommel
Copy link

@kai-uwe-rommel we do envision this strategy (name still to be finalized) to provide the basic primitive for a central manager (like Ansible) pushing finalization markers to individual hosts.
I tried to put together a very simple playbook to see how the integration could look like, at https://github.com/lucab/ansible-zincati-marker-file (no fancy stuff, just basic file writing/deleting).
Can you give a look at all of this and see if it would generally fit into your expected flow?

Thanks. Yes, that looks good, i.e. practical. It would allow us to explicitly upgrade a bunch of machines via a central action (Ansible) within a defined maintenance window.
In our case, these machines are load balancers and NFS servers for OCP/OKD clusters.
So we need to strictly adhere to defined maintenance windows.
Currently, we maintain the upgrades of the machines manually ...

@kelvinfan001
Copy link
Member Author

@lucab I think this should be ready for another look as well.

@lucab lucab modified the milestones: v0.0.22, vNext Jul 7, 2021
@dustymabe
Copy link
Member

@lucab I think this should be ready for another look as well.

bump

@lucab lucab modified the milestones: v0.0.23, vNext Aug 5, 2021
Add a simple, flexible strategy that checks the directory
`/var/lib/zincati/admin/marker_file_strategy` for a JSON file
(optionally containing an expiry timestamp key) named
`allowfinalize.json` and allows update finalization if such file
exists and has the proper permissions.
This explains how the `marker_file` strategy works and how to
configure it.
@mysticaltech
Copy link

Will be very useful! Keep up the good work folks.

@dustymabe
Copy link
Member

I'd like to revive this at some point. The PR was closed but it doesn't say who or why it was closed.

@jlebon
Copy link
Member

jlebon commented Mar 20, 2023

I'd like to revive this at some point. The PR was closed but it doesn't say who or why it was closed.

I think what happened there is that the fork was closed, so GitHub auto-closed all PRs from it.

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

updates: new strategy based on local filesystem
7 participants