Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkg/daemon: add a current config on disk check #612

Merged
merged 1 commit into from
Apr 18, 2019

Conversation

runcom
Copy link
Member

@runcom runcom commented Apr 9, 2019

Signed-off-by: Antonio Murdaca runcom@linux.com

- What I did

Add an on disk fingerprint about what our current config is on a node (avoiding resync if we changed it or restored from an etcd backup). We check what we have on disk with what we have in annotations before triggering a sync.

- How to verify it

- Description for the changelog

@openshift-ci-robot openshift-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Apr 9, 2019
@runcom
Copy link
Member Author

runcom commented Apr 9, 2019

/hold

Was just discussing this with @derekwaynecarr but thought to start experimenting if this is needed/working

@openshift-ci-robot openshift-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Apr 9, 2019
if !os.IsNotExist(err) {
return false, errors.Wrapf(err, "loading current config fingerprint")
}
return false, nil
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is wrong (note to self)

@openshift-ci-robot openshift-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Apr 9, 2019
@runcom runcom force-pushed the current-fingerprint branch 2 times, most recently from 485a3b6 to 6f5f21a Compare April 9, 2019 19:19
@@ -124,6 +125,9 @@ const (
pathDevNull = "/dev/null"
// pathStateJSON is where we store temporary state across config changes
pathStateJSON = "/etc/machine-config-daemon/state.json"
// currentConfigFingerprint is where we store the current config on disk to validate
// against annotations changes
currentConfigFingerprint = "/etc/machine-config-daemon/current"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't a fingerprint, it's the whole config right? I find this idea that we serialize all of the files back into JSON as a file in the filesystem kind of funny. But I guess it's not a huge amount of data...

oc get -o yaml machineconfigs/rendered-worker-74e29c89eb37d7aaa30c33c481fae174 | gzip | wc -c
10197

We can certainly assume 10k of data is free.

On the kind-of bikeshed topic, I think /var is probably a better place for this. The ostree model is that /etc is data humans might edit (e.g. "ssh to node and vi").

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed naming and moved under /var/machine-config-daemon

return nil, nil, err
}

mcJSON, err := os.Open(currentConfigFingerprint)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need to do this before we load currentConfig?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uhm, we may, I can't see why though, would you explain (I feel dumb)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the current config isn't available in the apiserver, won't the Get above fail before we can find our cached version?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh gosh yeah

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No wait, the scenario here is that after a restore we still have something in the apiserver, which is really what we want on the machine itself, if current isn't on the apiserver, there's not much we can do anyway (which is the BZ we got from David yesterday but this PR isn't tackling that).

This is something I wrote to understand this better:

- current=configA == desired=configA
upgrade
- current=configA != desired=configB (desired being generated by upgrade and it's being applied as new current)
roll out new configs and MCD syncs
- current=configB == desired=configB

...

trigger a backup restore pre-upgrade which isn't calling informers
- current=configA == desired=configA
but now what's on disk is still configB

we now need something which retriggers the MCD to sync to configA

configA is what we need anyway, so it must be in the apiserver after a restore

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can i ask: when is a backup restore triggered? i feel like i understand what this pr is supposed to do, but not fully why?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's work still ongoing on others to spec out how/when/where an etcd backup is performed (and a subsequent restore)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cgwalters can you check my answer above? I think this is working as intended but I'm waiting on a way to test this out

@runcom runcom changed the title WIP: pkg/daemon: add a current config on disk check pkg/daemon: add a current config on disk check Apr 16, 2019
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 16, 2019
@runcom
Copy link
Member Author

runcom commented Apr 16, 2019

This is good to review and get in - as it's treated as a bug. The testing goes later once the other teams/steps are complete

@runcom runcom added the jira label Apr 16, 2019
@kikisdeliveryservice
Copy link
Contributor

aws route errors:
/test e2e-aws-op

@kikisdeliveryservice
Copy link
Contributor

I know that this is a bug, but I cant find a BZ/something else for it? Could we add Bug to title?

@runcom
Copy link
Member Author

runcom commented Apr 16, 2019

I know that this is a bug, but I cant find a BZ/something else for it? Could we add Bug to title?

there's no BZ for this, came out from a conversation with Derek on Slack :( not sure we need one tho

@kikisdeliveryservice
Copy link
Contributor

@runcom cool! just wanted to make sure I didn't miss it somewhere.

@kikisdeliveryservice
Copy link
Contributor

weird test didn't re-run

/retest

return nil, nil, err
}
if fingerprintMC.GetName() != currentConfig.GetName() {
return fingerprintMC, currentConfig, nil
Copy link
Member

@cgwalters cgwalters Apr 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I read your comment a few times...I am finding it really hard to capture the flow/state of things in my head right now.

Edit: ignore this
It feels like this would all be a lot simpler to reason about if basically we had /var/machine-config-daemon/config<hash>.json and we kept those for every config version that was relevant to us (current, desired) and used it if it was present instead of talking to the api server.

This would also help address the other BZ about having things be deleted from the apiserver.

That said...after thinking about this more I think you're right, it will work. What I was worried about is "if we're saying currentConfig is desiredConfig, how do we then actually honor the real desiredConfig?". But if they are different then we'll do a config transition and then on the next boot we should transition to the real desiredConfig?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK here's another way to look at it - we're basically saying we can't trust our annotations to describe our current state, the real state is a file on disk. So why are we looking at the annotation at all, versus setting it to the right thing if it's different from what's on disk?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So why are we looking at the annotation at all, versus setting it to the right thing if it's different from what's on disk?

RIght, I had the same thought indeed, assuming we're talking about the currentConfig (desiredConfig has to come from somewhere/apiserver otherwise we can't really progress right?). So I guess a natural follow up to this would be to avoid talking to the apiserver when querying the currentConfig and just use what it's reflected on disk right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said...after thinking about this more I think you're right, it will work. What I was worried about is "if we're saying currentConfig is desiredConfig, how do we then actually honor the real desiredConfig?". But if they are different then we'll do a config transition and then on the next boot we should transition to the real desiredConfig?

right, that's my understand of what will happen if there's a real drift between currentOnAPI vs currentOnDisk.

Basically, the first sync would reconcile currentOnAPI with currentOnDisk, after we're done with that, there will be another immediate sync to honor desiredConfig (which, in our disaster recovery story is gonna be almost always the same as currentOnAPI, unless you backup an etcd with current != desired at the time you actually backup, right?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like this would all be a lot simpler to reason about if basically we had /var/machine-config-daemon/config<hash>.json and we kept those for every config version that was relevant to us (current, desired) and used it if it was present instead of talking to the api server.

I'm not sure how could we keep desired on disk 🤔 that has always have to come from apiserver right? I feel I'm missing something

Copy link
Contributor

@kikisdeliveryservice kikisdeliveryservice Apr 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or are we saying that in this case the currentConfig is the desired from the fingerprint? would it go from fingerprint->current-> desired? or....?

is there any case where if we have fingerprint, current and desired we would not want to ultimately end up in the desiredConfig?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fingerprint->current-> desired? or....?

exactly that but in the DR case, fingerprint=configB,current=configA,desired=configA. So it's just one real sync+reboot

is there any case where if we have fingerprint, current and desired we would not want to ultimately end up in the desiredConfig?

nope, there shouldn't be any such case afaict. Desired is always evaluated at sync+1 if fingerprint and current drift

Copy link
Member Author

@runcom runcom Apr 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing bugging me too is...why don't we

return fingerprintMC, desiredConfig, nil ?

Uhm, yeah, so using my flow from above comments, would that work in this case where I take a snapshot exactly in the middle of a sync where current != desired (cause yolo!):

current=configA != desired=configB
snapshot

upgrade
- current=configB != desired=configC
roll out new configs and MCD syncs
- current=configC == desired=configC

...

trigger a backup restore pre-upgrade which isn't calling informers
- current=configA != desired=configB
but now what's on disk is still configC

should we first go from configC on disk to configA on disk, and only then configB? Your code would go straight from configC on disk to configB (which avoids a sync, but should we anyway or do we risk missing something?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(which avoids a sync, but should we anyway or do we risk missing something?)

I am not sure what we'd miss. It sounds like you're trying to be more conservative here just on general principle? I don't object to that to be clear. But I don't understand how this would be different from any other config change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, it's really me being just conservative indeed, I guess it would be fine anyway to avoid a sync, so yeah, changing

@runcom runcom removed the jira label Apr 16, 2019
Signed-off-by: Antonio Murdaca <runcom@linux.com>
@runcom
Copy link
Member Author

runcom commented Apr 17, 2019

alrighty, let's see what tests say and then pull the trigger on this for the DR story

@runcom
Copy link
Member Author

runcom commented Apr 17, 2019

/hold cancel

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 17, 2019
@kikisdeliveryservice
Copy link
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Apr 18, 2019
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kikisdeliveryservice, runcom

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [kikisdeliveryservice,runcom]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@runcom
Copy link
Member Author

runcom commented Apr 18, 2019

/retest

@openshift-merge-robot openshift-merge-robot merged commit 0314083 into openshift:master Apr 18, 2019
@runcom runcom deleted the current-fingerprint branch April 18, 2019 20:47
runcom added a commit to runcom/machine-config-operator that referenced this pull request Apr 18, 2019
Mimick (and copy from) pkg/controller testing, hopefully one day I'll spend
a week trying to abstract all the duplicate code in pkg/controller
testings and now pkg/daemon...but not today.

This patch is adding tests for
openshift#612 which
would greatly benefit from some testing.

This can be used as a start to add more tests *hint hint*
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants