Add MOTD warning if Ignition is run more than once #1214

Closed
bgilbert opened this issue May 21, 2021 · 10 comments

@bgilbert
Contributor

Feature Request

Environment

Any

Desired Feature

Add a systemd service to the Dracut module that runs on "first boot" but checks for evidence that this is not actually the first boot, such as /etc/machine-id or journal entries from previous boots. If it finds any, create an /etc/issue.d or MOTD fragment saying that rerunning Ignition is not supported and may not work correctly.

Other Information

The check will need to avoid false positives if Ignition is running in a live system with persistent state (e.g. persistent /var).
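
For illustration only, the requested check could be sketched roughly as follows. This is a minimal Python sketch, not a proposed implementation (a real unit in the Dracut module would more likely be a shell script). The function names, the `is_live` flag, the fragment filename `30_ignition_rerun.issue`, and the warning wording are all hypothetical; only `/etc/machine-id` and `/etc/issue.d` come from the request above.

```python
from pathlib import Path

# Hypothetical wording; the issue only says what the message should convey.
WARNING = (
    "WARNING: Ignition appears to have run before on this system.\n"
    "Rerunning Ignition is not supported and may not work correctly.\n"
)


def looks_previously_booted(sysroot: Path) -> bool:
    """Return True if /etc/machine-id under sysroot holds an initialized ID."""
    machine_id = sysroot / "etc" / "machine-id"
    if not machine_id.is_file():
        return False
    content = machine_id.read_text().strip()
    # An empty file or the literal "uninitialized" still means "first boot".
    return content not in ("", "uninitialized")


def maybe_write_warning(sysroot: Path, is_live: bool) -> bool:
    """Write an issue.d fragment if this does not look like a first boot.

    `is_live` stands in for whatever live-environment check is available;
    live systems are skipped to avoid false positives from persistent state.
    """
    if is_live or not looks_previously_booted(sysroot):
        return False
    issue_dir = sysroot / "etc" / "issue.d"
    issue_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical fragment name, chosen for this sketch only.
    (issue_dir / "30_ignition_rerun.issue").write_text(WARNING)
    return True
```

Taking `sysroot` as a parameter keeps the check usable from the initrd, where the real root is mounted somewhere other than `/`.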

jlebon added a commit to jlebon/fedora-coreos-config that referenced this issue Jul 6, 2021
Some users sometimes may not realize that they're using a pre-booted
version of a CoreOS image. This makes things confusing because they then
don't understand why the Ignition config wasn't applied.

There's no way to consistently detect this, but at least we can print an
informational message about how many times the node has been booted to
help there.

Related: https://bugzilla.redhat.com/show_bug.cgi?id=1977949
Related: coreos/ignition#1214
jlebon added a commit to jlebon/fedora-coreos-config that referenced this issue Jul 8, 2021
Some users sometimes may not realize that they're using a pre-booted
version of a CoreOS image. This makes things confusing because they then
don't understand why the Ignition config wasn't applied.

There's no way to consistently detect this, but at least we can print an
informational message about (1) when Ignition ran, and (2) how many
boots ago that was.

This enhances the Ignition issue we already write for whether a
user config was provided rather than creating a separate one.

Related: bugzilla.redhat.com/show_bug.cgi?id=1977949
Related: coreos/ignition#1214
jlebon added a commit to jlebon/fedora-coreos-config that referenced this issue Jul 8, 2021
Some users sometimes may not realize that they're using a pre-booted
version of a CoreOS image. This makes things confusing because they then
don't understand why the Ignition config wasn't applied.

There's no way to consistently detect this, but at least we can print an
informational message about (1) when Ignition ran, and (2) how many
boots ago that was.

This enhances the Ignition issue we already write for whether a
user config was provided rather than creating a separate one.

Related: https://bugzilla.redhat.com/show_bug.cgi?id=1977949
Related: coreos/ignition#1214
jlebon added a commit to jlebon/fedora-coreos-config that referenced this issue Jul 16, 2021
Some users sometimes may not realize that they're using a pre-booted
version of a CoreOS image. This makes things confusing because they then
don't understand why the Ignition config wasn't applied.

There's no way to consistently detect this, but at least we can print an
informational message about (1) when Ignition ran, and (2) how many
boots ago that was.

This enhances the Ignition issue we already write for whether a
user config was provided rather than creating a separate one.

Related: https://bugzilla.redhat.com/show_bug.cgi?id=1977949
Related: coreos/ignition#1214
@cgwalters
Member

> The check will need to avoid false positives if Ignition is running in a live system with persistent state (e.g. persistent /var).

It seems to me there are two approaches here. The first is to simply skip this check if we're running in a live system, which would need to be something like the is-live-image bit we have in FCOS (tied to FCOS), or the more stable /run/ostree-live (tied to ostree).

But I think a stronger version is to write something like /etc/.ignition.stamp after Ignition runs (maybe a JSON-formatted file with a bit of metadata about the Ignition config that was run, at least its sha512?). Since Ignition should mostly be about writing config in /etc, having a stamp file in /etc lets us easily detect the overwriting case.
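
To make the stamp-file idea concrete, here is a minimal Python sketch under stated assumptions. The path `/etc/.ignition.stamp` and the sha512 come from the suggestion above; the JSON field names (`configSha512`, `appliedAt`) and the function names are made up for illustration and are not anything Ignition defines.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def stamp_path(sysroot: Path) -> Path:
    """Location of the stamp file suggested in the comment above."""
    return sysroot / "etc" / ".ignition.stamp"


def write_stamp(sysroot: Path, config_bytes: bytes) -> Path:
    """Record that a config was applied, with its sha512 and a timestamp."""
    stamp = stamp_path(sysroot)
    stamp.parent.mkdir(parents=True, exist_ok=True)
    payload = {
        # Hypothetical field names, chosen for this sketch only.
        "configSha512": hashlib.sha512(config_bytes).hexdigest(),
        "appliedAt": datetime.now(timezone.utc).isoformat(),
    }
    stamp.write_text(json.dumps(payload, indent=2) + "\n")
    return stamp


def previous_run(sysroot: Path) -> Optional[dict]:
    """Return the recorded metadata if a stamp exists, else None."""
    stamp = stamp_path(sysroot)
    if not stamp.is_file():
        return None
    return json.loads(stamp.read_text())
```

A caller would check `previous_run()` before applying a config and call `write_stamp()` afterwards; a non-`None` result is the "overwriting" case.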

@cgwalters
Member

Actually though thinking about this more, the original suggestion above of checking for /etc/machine-id seems like it'd be correct in all scenarios and not require introducing any new files.

jlebon added a commit to jlebon/fedora-coreos-config that referenced this issue Jul 19, 2021
Some users sometimes may not realize that they're using a pre-booted
version of a CoreOS image. This makes things confusing because they then
don't understand why the Ignition config wasn't applied.

There's no way to consistently detect this, but at least we can print an
informational message about (1) when Ignition ran, and (2) how many
boots ago that was.

This enhances the Ignition issue we already write for whether a
user config was provided rather than creating a separate one.

Related: https://bugzilla.redhat.com/show_bug.cgi?id=1977949
Related: coreos/ignition#1214
@bgilbert
Contributor Author

is-live-image is tied to Ignition, not to FCOS. Every distro that implements Ignition in a live environment is expected to provide that command. If we're implementing the MOTD warning in upstream Ignition, is-live-image is the defined way to check for a live environment.

I don't think we need to define a new state file just for this check, but I also don't think we should limit ourselves to /etc/machine-id because the user could just remove that too. Ideally the check would be reasonably robust against spoofing. For that reason, I like the idea of both checking /etc/machine-id and checking the journal for any record of previous boots.
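
As a rough sketch of combining both signals (Python, for illustration only): the function name is hypothetical, and treating any file under the persistent journal directory as evidence of a previous boot is a simplifying assumption. It is cruder than inspecting boot IDs in the journal itself.

```python
from pathlib import Path
from typing import List


def prior_boot_evidence(sysroot: Path) -> List[str]:
    """Collect independent hints that this root has been booted before."""
    evidence = []

    # Signal 1: an initialized /etc/machine-id.
    machine_id = sysroot / "etc" / "machine-id"
    if machine_id.is_file():
        content = machine_id.read_text().strip()
        if content not in ("", "uninitialized"):
            evidence.append("machine-id")

    # Signal 2: persistent journal files, which live under
    # /var/log/journal/<machine-id>/. Assumption: any such file counts.
    journal_dir = sysroot / "var" / "log" / "journal"
    if journal_dir.is_dir() and any(journal_dir.glob("*/*.journal")):
        evidence.append("journal")

    return evidence
```

Returning the list rather than a boolean lets the caller warn when either signal is present, so removing just one of them does not silence the warning.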

@bgilbert
Contributor Author

There are tradeoffs between doing this in the initrd or the real root, though. The initrd has is-live-image and doesn't provide a way to disable units, which are both pluses, but reading the journal from there is a bit awkward and feels like a layering violation.

jlebon added a commit to coreos/fedora-coreos-config that referenced this issue Jul 19, 2021
Some users sometimes may not realize that they're using a pre-booted
version of a CoreOS image. This makes things confusing because they then
don't understand why the Ignition config wasn't applied.

There's no way to consistently detect this, but at least we can print an
informational message about (1) when Ignition ran, and (2) how many
boots ago that was.

This enhances the Ignition issue we already write for whether a
user config was provided rather than creating a separate one.

Related: https://bugzilla.redhat.com/show_bug.cgi?id=1977949
Related: coreos/ignition#1214
@jlebon
Member

jlebon commented Jul 19, 2021

> I don't think we need to define a new state file just for this check, but I also don't think we should limit ourselves to /etc/machine-id because the user could just remove that too. Ideally the check would be reasonably robust against spoofing.

Meh... IMO just checking /etc/machine-id is good enough. Whatever we check for can be worked around. At some point we have to just let the user do what they want and not try to outsmart them even if at our level it looks like they're doing something wrong. /etc/machine-id is a good test for that I think because removing it is much more likely to be intentional vs accidental (which I think is closer to what we're trying to catch).

@bgilbert
Contributor Author

Yeah, I was being sloppy by saying "the user" above. The problem isn't so much individual users, but tools. If an intermediate tool is running Ignition twice, end users of that tool may not realize that there are downstream consequences that affect them. So it'd be good for the warning to have a decent chance of reaching the end user unless the tool intentionally blocks it; in the latter case it's clear that the tool authors know about the problem and are taking responsibility for it.

There's already a technical reason for a tool to remove /etc/machine-id (Ignition systemd unit enablement won't work without it) so it doesn't really serve that purpose. The journal is better, but there are legitimate reasons to remove it too. I suppose the most explicit approach would be to create a flag file such as /var/run/ignition-may-not-work-correctly-if-run-twice.

@jlebon
Member

jlebon commented Jul 19, 2021

Actually I think this would be trivial to add on top of coreos/fedora-coreos-config#1086 now that we have /var/lib/coreos/ignition.info.json. Though it implies that the logic would live in f-c-c and not here.

@sohankunkerkar
Member

> Actually I think this would be trivial to add on top of coreos/fedora-coreos-config#1086 now that we have /var/lib/coreos/ignition.info.json. Though it implies that the logic would live in f-c-c and not here.

I think it seems like a good idea.

@bgilbert
Contributor Author

bgilbert commented Jul 21, 2021

From OOB discussion with @jlebon: let's land #1250 or equivalent, then add a unit to the Dracut module that runs after mount stage and before files stage, checks whether the result file already exists, and writes the MOTD fragment if so. By making that decision using a file specific to Ignition, we avoid false negatives where some tool removes the file for unrelated reasons.
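
The decision logic described here is small enough to sketch. The following Python is illustrative only: both paths are passed in by the caller because this thread does not name the result file or the fragment location, and the function name and message text are hypothetical.

```python
from pathlib import Path

# Hypothetical wording for the fragment.
MESSAGE = (
    "Ignition has already run on this system.\n"
    "Running it again is not supported and may not work correctly.\n"
)


def warn_if_rerun(result_file: Path, motd_fragment: Path) -> bool:
    """Write the MOTD fragment if Ignition's result file already exists.

    Intended to run after the root filesystem is mounted and before files
    are written, so an existing result file can only come from an earlier run.
    """
    if not result_file.exists():
        return False
    motd_fragment.parent.mkdir(parents=True, exist_ok=True)
    motd_fragment.write_text(MESSAGE)
    return True
```

Keying the decision on a single Ignition-specific file keeps the check to one `exists()` call.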

sohankunkerkar added a commit to sohankunkerkar/ignition that referenced this issue Jul 30, 2021
This change notices if a report already exists and accordingly
nests the report in previousReport.
related to coreos#1214
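
The nesting described in this commit message can be pictured with a short Python sketch. Only the `previousReport` key comes from the commit message; the function name and the rest of the report layout are assumptions for illustration, not Ignition's actual code or schema.

```python
import json
from pathlib import Path


def write_report(path: Path, report: dict) -> dict:
    """Write a run report, nesting any existing report under previousReport."""
    new_report = dict(report)  # copy so the caller's dict is not mutated
    if path.exists():
        # The older report may itself contain a previousReport, so repeated
        # runs form a chain from newest to oldest.
        new_report["previousReport"] = json.loads(path.read_text())
    path.write_text(json.dumps(new_report, indent=2) + "\n")
    return new_report
```

With this shape, the presence of `previousReport` in the file on disk is itself the signal that Ignition ran more than once.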
sohankunkerkar added a commit to sohankunkerkar/ignition that referenced this issue Aug 2, 2021
This change notices if a report already exists and accordingly
nests the report in previousReport.
related to coreos#1214
sohankunkerkar added a commit to sohankunkerkar/ignition that referenced this issue Aug 3, 2021
This change notices if a report already exists and accordingly
nests the report in previousReport.
related to coreos#1214
sohankunkerkar added a commit to sohankunkerkar/ignition that referenced this issue Aug 4, 2021
This change notices if a report already exists and accordingly
nests the report in previousReport.
related to coreos#1214
sohankunkerkar added a commit to sohankunkerkar/ignition that referenced this issue Aug 5, 2021
This change notices if a report already exists and accordingly
nests the report in previousReport.
related to coreos#1214
sohankunkerkar added a commit to coreos/fedora-coreos-config that referenced this issue Aug 6, 2021
This change adds a warning on the serial console if Ignition
is run more than once. This is related to coreos/ignition#1214
@sohankunkerkar
Member

Closing this issue as both PRs #1254 and coreos/fedora-coreos-config#1148 (which addresses this issue) got merged.

HuijingHei pushed a commit to HuijingHei/fedora-coreos-config that referenced this issue Oct 10, 2023
Some users sometimes may not realize that they're using a pre-booted
version of a CoreOS image. This makes things confusing because they then
don't understand why the Ignition config wasn't applied.

There's no way to consistently detect this, but at least we can print an
informational message about (1) when Ignition ran, and (2) how many
boots ago that was.

This enhances the Ignition issue we already write for whether a
user config was provided rather than creating a separate one.

Related: https://bugzilla.redhat.com/show_bug.cgi?id=1977949
Related: coreos/ignition#1214
HuijingHei pushed a commit to HuijingHei/fedora-coreos-config that referenced this issue Oct 10, 2023
This change adds a warning on the serial console if Ignition
is run more than once. This is related to coreos/ignition#1214