Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enhancements/update/targeted-update-edge-blocking: Propose a new enhancement #821

Conversation

wking
Copy link
Member

@wking wking commented Jul 2, 2021

This enhancement proposes a mechanism for blocking edges for the subset of clusters considered vulnerable to known issues with a particular update or target release.

This is the third iteration of this proposal, replacing #426 (update-service-side decisions based on Telemetry queries) and #737 (update-service-side decisions based on client-supplied query parameters). I'm now recommending cluster-side decisions, with the update service declaring conditional edges and telling the cluster-side tools how to decide if the cluster is vulnerable to the known bugs impacting its update decisions.

@wking wking force-pushed the targeted-edge-blocking-cluster-side-prom branch 2 times, most recently from 95efba9 to e664c70 Compare July 2, 2021 04:39
@wking wking changed the title enhancements/update/targeted-update-edge-blocking: Propose a new enhancement WIP: enhancements/update/targeted-update-edge-blocking: Propose a new enhancement Jul 2, 2021
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 2, 2021
@wking wking force-pushed the targeted-edge-blocking-cluster-side-prom branch from e664c70 to 61c6c45 Compare July 2, 2021 05:14
@wking wking force-pushed the targeted-edge-blocking-cluster-side-prom branch from 61c6c45 to a034a69 Compare July 2, 2021 18:28
@sdodson
Copy link
Member

sdodson commented Jul 7, 2021

@wking I'm unsure if we should add this to alternatives section or where you'd like to address it, however I think it'd be great to think through how we can enable future cluster agnostic graph viewers. I'm a bit worried that the suggested implementation relies a bit too much on client side merging of multiple graphs which would complicate our ability to visualize edges into Blocked, Conditionally Blocked, In Progress (assuming phased rollouts) buckets. Such viewers wouldn't need to be capable of evaluating any conditional blocks but would be expected to present the conditions.

@wking wking force-pushed the targeted-edge-blocking-cluster-side-prom branch from a034a69 to 97c2bf1 Compare July 14, 2021 03:51
@wking wking force-pushed the targeted-edge-blocking-cluster-side-prom branch from 97c2bf1 to 66ecf6a Compare July 21, 2021 14:49
Copy link
Member

@sdodson sdodson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still a few open threads with questions but I think this is probably ready to have WIP removed. Would like to see affirmative support from other CVO maintainers that the level of complexity in the CVO is necessary.

enhancements/update/targeted-update-edge-blocking.md Outdated Show resolved Hide resolved
enhancements/update/targeted-update-edge-blocking.md Outdated Show resolved Hide resolved
@jottofar
Copy link
Contributor

Still a few open threads with questions but I think this is probably ready to have WIP removed. Would like to see affirmative support from other CVO maintainers that the level of complexity in the CVO is necessary.

I don't find the suggested CVO changes overly complex. Seems to me a reasonable outgrowth of what we already have in CVO.

@wking wking force-pushed the targeted-edge-blocking-cluster-side-prom branch 2 times, most recently from f853d4a to fe5073a Compare August 27, 2021 19:53
@LalatenduMohanty
Copy link
Member

LalatenduMohanty commented Aug 27, 2021

We need a section for what it means for a disconnected cluster (as they might not have telemetry), when we release a cincinnati-graph-data image for OSUS operator setup.

@wking wking force-pushed the targeted-edge-blocking-cluster-side-prom branch from fe5073a to beeee98 Compare August 27, 2021 20:55
wking added a commit to wking/openshift-api that referenced this pull request Sep 21, 2021
Implementing targeted edge blocking as described in [1].  Except:

* Some properties which are optional in the enhancement are required
  here.  I think this is strictly worse handling for the
  cluster-version operator, because we are just getting this data from
  the upstream update service's Cincinnati JSON, and if the upstream
  says "here's a risk, but I am not linking a URI", and the CVO
  subsequently rejects the entire update payload as invalid, then
  downstream consumers are not hearing about the
  reason/message/matchingRules that the upstream may have been passing
  along.  But I'm not an API approver, and David is asking for the
  +required in [2,3,4,5].  I dunno why he isn't finding my argument's
  convincing, but I'm out of time to continue arguing my position.

* Similarly, the ClusterCondition.Type property has an enum, because
  David requires it [6].  This means that the CVO will need to
  self-censor if there are unrecognized types (because we don't want
  our ClusterVersion writes rejected on API-server validation
  failures).  That means that instead of saying "... and there was
  another rule $UNRECOGNIZED_TYPE that I don't know how to evaluate",
  the CVO will have to silently drop that rule.  And if that's the
  only rule, the CVO will have to silently drop that risk and
  conditional update.  David is fine with that, even if I'm not on
  board, and he's the API approver, so that's what we'll do.

Swagger, deepcopy, and CRD changes were generated with:

  $ make update

[1]: openshift/enhancements#821
[2]: openshift#1011 (comment)
[3]: openshift#1011 (comment)
[4]: openshift#1011 (comment)
[5]: openshift#1011 (comment)
[6]: openshift#1011 (comment)
wking added a commit to wking/openshift-api that referenced this pull request Sep 21, 2021
Implementing targeted edge blocking as described in [1].  Except:

* Some properties which are optional in the enhancement are required
  here.  I think this is strictly worse handling for the
  cluster-version operator, because we are just getting this data from
  the upstream update service's Cincinnati JSON, and if the upstream
  says "here's a risk, but I am not linking a URI", and the CVO
  subsequently rejects the entire update payload as invalid, then
  downstream consumers are not hearing about the
  reason/message/matchingRules that the upstream may have been passing
  along.  But I'm not an API approver, and David is asking for the
  +required in [2,3,4,5].  I dunno why he isn't finding my argument's
  convincing, but I'm out of time to continue arguing my position.

* Similarly, the ClusterCondition.Type property has an enum, because
  David requires it [6].  This means that the CVO will need to
  self-censor if there are unrecognized types (because we don't want
  our ClusterVersion writes rejected on API-server validation
  failures).  That means that instead of saying "... and there was
  another rule $UNRECOGNIZED_TYPE that I don't know how to evaluate",
  the CVO will have to silently drop that rule.  And if that's the
  only rule, the CVO will have to silently drop that risk and
  conditional update.  David is fine with that, even if I'm not on
  board, and he's the API approver, so that's what we'll do.

  I've also upcased the types to Always and PromQL; from David:

    Always and PromQL.  We use CamelCase for enumerated values

Swagger, deepcopy, and CRD changes were generated with:

  $ make update

[1]: openshift/enhancements#821
[2]: openshift#1011 (comment)
[3]: openshift#1011 (comment)
[4]: openshift#1011 (comment)
[5]: openshift#1011 (comment)
[6]: openshift#1011 (comment)
wking added a commit to wking/openshift-api that referenced this pull request Sep 22, 2021
Implementing targeted edge blocking as described in [1].  Except:

* Some properties which are optional in the enhancement are required
  here.  I think this is strictly worse handling for the
  cluster-version operator, because we are just getting this data from
  the upstream update service's Cincinnati JSON, and if the upstream
  says "here's a risk, but I am not linking a URI", and the CVO
  subsequently rejects the entire update payload as invalid, then
  downstream consumers are not hearing about the
  reason/message/matchingRules that the upstream may have been passing
  along.  But I'm not an API approver, and David is asking for the
  +required in [2,3,4,5].  I dunno why he isn't finding my argument's
  convincing, but I'm out of time to continue arguing my position.

* Similarly, the ClusterCondition.Type property has an enum, because
  David requires it [6].  This means that the CVO will need to
  self-censor if there are unrecognized types (because we don't want
  our ClusterVersion writes rejected on API-server validation
  failures).  That means that instead of saying "... and there was
  another rule $UNRECOGNIZED_TYPE that I don't know how to evaluate",
  the CVO will have to silently drop that rule.  And if that's the
  only rule, the CVO will have to silently drop that risk and
  conditional update.  David is fine with that, even if I'm not on
  board, and he's the API approver, so that's what we'll do.

  I've also upcased the types to Always and PromQL; from David:

    Always and PromQL.  We use CamelCase for enumerated values

* history[].overrides became history[].acceptedRisks to satisfy
  David's concerns about misinterpretation [7].

Swagger, deepcopy, and CRD changes were generated with:

  $ make update

[1]: openshift/enhancements#821
[2]: openshift#1011 (comment)
[3]: openshift#1011 (comment)
[4]: openshift#1011 (comment)
[5]: openshift#1011 (comment)
[6]: openshift#1011 (comment)
[7]: openshift#1011 (comment)
wking added a commit to wking/openshift-api that referenced this pull request Sep 22, 2021
Implementing targeted edge blocking as described in [1].  Except:

* Some properties which are optional in the enhancement are required
  here.  I think this is strictly worse handling for the
  cluster-version operator, because we are just getting this data from
  the upstream update service's Cincinnati JSON, and if the upstream
  says "here's a risk, but I am not linking a URI", and the CVO
  subsequently rejects the entire update payload as invalid, then
  downstream consumers are not hearing about the
  reason/message/matchingRules that the upstream may have been passing
  along.  But I'm not an API approver, and David is asking for the
  +required in [2,3,4,5].  I dunno why he isn't finding my argument's
  convincing, but I'm out of time to continue arguing my position.

* Similarly, the ClusterCondition.Type property has an enum, because
  David requires it [6].  This means that the CVO will need to
  self-censor if there are unrecognized types (because we don't want
  our ClusterVersion writes rejected on API-server validation
  failures).  That means that instead of saying "... and there was
  another rule $UNRECOGNIZED_TYPE that I don't know how to evaluate",
  the CVO will have to silently drop that rule.  And if that's the
  only rule, the CVO will have to silently drop that risk and
  conditional update.  David is fine with that, even if I'm not on
  board, and he's the API approver, so that's what we'll do.

  I've also upcased the types to Always and PromQL; from David:

    Always and PromQL.  We use CamelCase for enumerated values

* history[].overrides became history[].acceptedRisks to satisfy
  David's concerns about misinterpretation [7].

Swagger, deepcopy, and CRD changes were generated with:

  $ make update

[1]: openshift/enhancements#821
[2]: openshift#1011 (comment)
[3]: openshift#1011 (comment)
[4]: openshift#1011 (comment)
[5]: openshift#1011 (comment)
[6]: openshift#1011 (comment)
[7]: openshift#1011 (comment)
wking added a commit to wking/openshift-api that referenced this pull request Sep 22, 2021
Implementing targeted edge blocking as described in [1].  Except:

* Some properties which are optional in the enhancement are required
  here.  I think this is strictly worse handling for the
  cluster-version operator, because we are just getting this data from
  the upstream update service's Cincinnati JSON, and if the upstream
  says "here's a risk, but I am not linking a URI", and the CVO
  subsequently rejects the entire update payload as invalid, then
  downstream consumers are not hearing about the
  reason/message/matchingRules that the upstream may have been passing
  along.  But I'm not an API approver, and David is asking for the
  +required in [2,3,4,5].  I dunno why he isn't finding my argument's
  convincing, but I'm out of time to continue arguing my position.

* Similarly, the ClusterCondition.Type property has an enum, because
  David requires it [6].  This means that the CVO will need to
  self-censor if there are unrecognized types (because we don't want
  our ClusterVersion writes rejected on API-server validation
  failures).  That means that instead of saying "... and there was
  another rule $UNRECOGNIZED_TYPE that I don't know how to evaluate",
  the CVO will have to silently drop that rule.  And if that's the
  only rule, the CVO will have to silently drop that risk and
  conditional update.  David is fine with that, even if I'm not on
  board, and he's the API approver, so that's what we'll do.

  I've also upcased the types to Always and PromQL; from David:

    Always and PromQL.  We use CamelCase for enumerated values

* history[].overrides became history[].acceptedRisks to satisfy
  David's concerns about misinterpretation [7].

Swagger, deepcopy, and CRD changes were generated with:

  $ make update

[1]: openshift/enhancements#821
[2]: openshift#1011 (comment)
[3]: openshift#1011 (comment)
[4]: openshift#1011 (comment)
[5]: openshift#1011 (comment)
[6]: openshift#1011 (comment)
[7]: openshift#1011 (comment)
wking added a commit to wking/openshift-api that referenced this pull request Sep 22, 2021
Implementing targeted edge blocking as described in [1].  Except:

* Some properties which are optional in the enhancement are required
  here.  I think this is strictly worse handling for the
  cluster-version operator, because we are just getting this data from
  the upstream update service's Cincinnati JSON, and if the upstream
  says "here's a risk, but I am not linking a URI", and the CVO
  subsequently rejects the entire update payload as invalid, then
  downstream consumers are not hearing about the
  reason/message/matchingRules that the upstream may have been passing
  along.  But I'm not an API approver, and David is asking for the
  +required in [2,3,4,5].  I dunno why he isn't finding my argument's
  convincing, but I'm out of time to continue arguing my position.

* Similarly, the ClusterCondition.Type property has an enum, because
  David requires it [6].  This means that the CVO will need to
  self-censor if there are unrecognized types (because we don't want
  our ClusterVersion writes rejected on API-server validation
  failures).  That means that instead of saying "... and there was
  another rule $UNRECOGNIZED_TYPE that I don't know how to evaluate",
  the CVO will have to silently drop that rule.  And if that's the
  only rule, the CVO will have to silently drop that risk and
  conditional update.  David is fine with that, even if I'm not on
  board, and he's the API approver, so that's what we'll do.

  I've also upcased the types to Always and PromQL; from David:

    Always and PromQL.  We use CamelCase for enumerated values

* history[].overrides became history[].acceptedRisks to satisfy
  David's concerns about misinterpretation [7].

Swagger, deepcopy, and CRD changes were generated with:

  $ make update

[1]: openshift/enhancements#821
[2]: openshift#1011 (comment)
[3]: openshift#1011 (comment)
[4]: openshift#1011 (comment)
[5]: openshift#1011 (comment)
[6]: openshift#1011 (comment)
[7]: openshift#1011 (comment)
wking added a commit to wking/openshift-api that referenced this pull request Sep 22, 2021
Implementing targeted edge blocking as described in [1].  Except:

* Some properties which are optional in the enhancement are required
  here.  I think this is strictly worse handling for the
  cluster-version operator, because we are just getting this data from
  the upstream update service's Cincinnati JSON, and if the upstream
  says "here's a risk, but I am not linking a URI", and the CVO
  subsequently rejects the entire update payload as invalid, then
  downstream consumers are not hearing about the
  reason/message/matchingRules that the upstream may have been passing
  along.  But I'm not an API approver, and David is asking for the
  +required in [2,3,4,5].  I dunno why he isn't finding my argument's
  convincing, but I'm out of time to continue arguing my position.

* Similarly, the ClusterCondition.Type property has an enum, because
  David requires it [6].  This means that the CVO will need to
  self-censor if there are unrecognized types (because we don't want
  our ClusterVersion writes rejected on API-server validation
  failures).  That means that instead of saying "... and there was
  another rule $UNRECOGNIZED_TYPE that I don't know how to evaluate",
  the CVO will have to silently drop that rule.  And if that's the
  only rule, the CVO will have to silently drop that risk and
  conditional update.  David is fine with that, even if I'm not on
  board, and he's the API approver, so that's what we'll do.

  I've also upcased the types to Always and PromQL; from David:

    Always and PromQL.  We use CamelCase for enumerated values

* history[].overrides became history[].acceptedRisks to satisfy
  David's concerns about misinterpretation [7].

Swagger, deepcopy, and CRD changes were generated with:

  $ make update

[1]: openshift/enhancements#821
[2]: openshift#1011 (comment)
[3]: openshift#1011 (comment)
[4]: openshift#1011 (comment)
[5]: openshift#1011 (comment)
[6]: openshift#1011 (comment)
[7]: openshift#1011 (comment)
wking added a commit to wking/openshift-api that referenced this pull request Sep 22, 2021
Implementing targeted edge blocking as described in [1].  Except:

* Some properties which are optional in the enhancement are required
  here.  I think this is strictly worse handling for the
  cluster-version operator, because we are just getting this data from
  the upstream update service's Cincinnati JSON, and if the upstream
  says "here's a risk, but I am not linking a URI", and the CVO
  subsequently rejects the entire update payload as invalid, then
  downstream consumers are not hearing about the
  reason/message/matchingRules that the upstream may have been passing
  along.  But I'm not an API approver, and David is asking for the
  +required in [2,3,4,5].  I dunno why he isn't finding my argument's
  convincing, but I'm out of time to continue arguing my position.

* Similarly, the ClusterCondition.Type property has an enum, because
  David requires it [6].  This means that the CVO will need to
  self-censor if there are unrecognized types (because we don't want
  our ClusterVersion writes rejected on API-server validation
  failures).  That means that instead of saying "... and there was
  another rule $UNRECOGNIZED_TYPE that I don't know how to evaluate",
  the CVO will have to silently drop that rule.  And if that's the
  only rule, the CVO will have to silently drop that risk and
  conditional update.  David is fine with that, even if I'm not on
  board, and he's the API approver, so that's what we'll do.

  I've also upcased the types to Always and PromQL; from David:

    Always and PromQL.  We use CamelCase for enumerated values

* history[].overrides became history[].acceptedRisks to satisfy
  David's concerns about misinterpretation [7].

Swagger, deepcopy, and CRD changes were generated with:

  $ make update

[1]: openshift/enhancements#821
[2]: openshift#1011 (comment)
[3]: openshift#1011 (comment)
[4]: openshift#1011 (comment)
[5]: openshift#1011 (comment)
[6]: openshift#1011 (comment)
[7]: openshift#1011 (comment)
wking added a commit to wking/openshift-api that referenced this pull request Sep 22, 2021
Implementing targeted edge blocking as described in [1].  Except:

* Some properties which are optional in the enhancement are required
  here.  I think this is strictly worse handling for the
  cluster-version operator, because we are just getting this data from
  the upstream update service's Cincinnati JSON, and if the upstream
  says "here's a risk, but I am not linking a URI", and the CVO
  subsequently rejects the entire update payload as invalid, then
  downstream consumers are not hearing about the
  reason/message/matchingRules that the upstream may have been passing
  along.  But I'm not an API approver, and David is asking for the
  +required in [2,3,4,5].  I dunno why he isn't finding my argument's
  convincing, but I'm out of time to continue arguing my position.

* Similarly, the ClusterCondition.Type property has an enum, because
  David requires it [6].  This means that the CVO will need to
  self-censor if there are unrecognized types (because we don't want
  our ClusterVersion writes rejected on API-server validation
  failures).  That means that instead of saying "... and there was
  another rule $UNRECOGNIZED_TYPE that I don't know how to evaluate",
  the CVO will have to silently drop that rule.  And if that's the
  only rule, the CVO will have to silently drop that risk and
  conditional update.  David is fine with that, even if I'm not on
  board, and he's the API approver, so that's what we'll do.

  I've also upcased the types to Always and PromQL; from David:

    Always and PromQL.  We use CamelCase for enumerated values

* history[].overrides became history[].acceptedRisks to satisfy
  David's concerns about misinterpretation [7].

Swagger, deepcopy, and CRD changes were generated with:

  $ make update

[1]: openshift/enhancements#821
[2]: openshift#1011 (comment)
[3]: openshift#1011 (comment)
[4]: openshift#1011 (comment)
[5]: openshift#1011 (comment)
[6]: openshift#1011 (comment)
[7]: openshift#1011 (comment)
wking added a commit to wking/openshift-api that referenced this pull request Sep 22, 2021
Implementing targeted edge blocking as described in [1].  Except:

* Some properties which are optional in the enhancement are required
  here.  I think this is strictly worse handling for the
  cluster-version operator, because we are just getting this data from
  the upstream update service's Cincinnati JSON, and if the upstream
  says "here's a risk, but I am not linking a URI", and the CVO
  subsequently rejects the entire update payload as invalid, then
  downstream consumers are not hearing about the
  reason/message/matchingRules that the upstream may have been passing
  along.  But I'm not an API approver, and David is asking for the
  +required in [2,3,4,5].  I dunno why he isn't finding my argument's
  convincing, but I'm out of time to continue arguing my position.

* Similarly, the ClusterCondition.Type property has an enum, because
  David requires it [6].  This means that the CVO will need to
  self-censor if there are unrecognized types (because we don't want
  our ClusterVersion writes rejected on API-server validation
  failures).  That means that instead of saying "... and there was
  another rule $UNRECOGNIZED_TYPE that I don't know how to evaluate",
  the CVO will have to silently drop that rule.  And if that's the
  only rule, the CVO will have to silently drop that risk and
  conditional update.  David is fine with that, even if I'm not on
  board, and he's the API approver, so that's what we'll do.

  I've also upcased the types to Always and PromQL; from David:

    Always and PromQL.  We use CamelCase for enumerated values

* history[].overrides became history[].acceptedRisks to satisfy
  David's concerns about misinterpretation [7].

Swagger, deepcopy, and CRD changes were generated with:

  $ make update

[1]: openshift/enhancements#821
[2]: openshift#1011 (comment)
[3]: openshift#1011 (comment)
[4]: openshift#1011 (comment)
[5]: openshift#1011 (comment)
[6]: openshift#1011 (comment)
[7]: openshift#1011 (comment)
wking added a commit to wking/openshift-api that referenced this pull request Sep 22, 2021
Implementing targeted edge blocking as described in [1].  Except:

* Some properties which are optional in the enhancement are required
  here.  I think this is strictly worse handling for the
  cluster-version operator, because we are just getting this data from
  the upstream update service's Cincinnati JSON, and if the upstream
  says "here's a risk, but I am not linking a URI", and the CVO
  subsequently rejects the entire update payload as invalid, then
  downstream consumers are not hearing about the
  reason/message/matchingRules that the upstream may have been passing
  along.  But I'm not an API approver, and David is asking for the
  +required in [2,3,4,5].  I dunno why he isn't finding my argument's
  convincing, but I'm out of time to continue arguing my position.

* Similarly, the ClusterCondition.Type property has an enum, because
  David requires it [6].  This means that the CVO will need to
  self-censor if there are unrecognized types (because we don't want
  our ClusterVersion writes rejected on API-server validation
  failures).  That means that instead of saying "... and there was
  another rule $UNRECOGNIZED_TYPE that I don't know how to evaluate",
  the CVO will have to silently drop that rule.  And if that's the
  only rule, the CVO will have to silently drop that risk and
  conditional update.  David is fine with that, even if I'm not on
  board, and he's the API approver, so that's what we'll do.

  I've also upcased the types to Always and PromQL; from David:

    Always and PromQL.  We use CamelCase for enumerated values

* history[].overrides became history[].acceptedRisks to satisfy
  David's concerns about misinterpretation [7].

Swagger, deepcopy, and CRD changes were generated with:

  $ make update

[1]: openshift/enhancements#821
[2]: openshift#1011 (comment)
[3]: openshift#1011 (comment)
[4]: openshift#1011 (comment)
[5]: openshift#1011 (comment)
[6]: openshift#1011 (comment)
[7]: openshift#1011 (comment)
wking added a commit to wking/openshift-api that referenced this pull request Sep 23, 2021
Implementing targeted edge blocking as described in [1].  Except:

* Some properties which are optional in the enhancement are required
  here.  I think this is strictly worse handling for the
  cluster-version operator, because we are just getting this data from
  the upstream update service's Cincinnati JSON, and if the upstream
  says "here's a risk, but I am not linking a URI", and the CVO
  subsequently rejects the entire update payload as invalid, then
  downstream consumers are not hearing about the
  reason/message/matchingRules that the upstream may have been passing
  along.  But I'm not an API approver, and David is asking for the
  +required in [2,3,4,5].  I dunno why he isn't finding my argument's
  convincing, but I'm out of time to continue arguing my position.

* Similarly, the ClusterCondition.Type property has an enum, because
  David requires it [6].  This means that the CVO will need to
  self-censor if there are unrecognized types (because we don't want
  our ClusterVersion writes rejected on API-server validation
  failures).  That means that instead of saying "... and there was
  another rule $UNRECOGNIZED_TYPE that I don't know how to evaluate",
  the CVO will have to silently drop that rule.  And if that's the
  only rule, the CVO will have to silently drop that risk and
  conditional update.  David is fine with that, even if I'm not on
  board, and he's the API approver, so that's what we'll do.

  I've also upcased the types to Always and PromQL; from David:

    Always and PromQL.  We use CamelCase for enumerated values

* history[].overrides became history[].acceptedRisks to satisfy
  David's concerns about misinterpretation [7].

Swagger, deepcopy, and CRD changes were generated with:

  $ make update

[1]: openshift/enhancements#821
[2]: openshift#1011 (comment)
[3]: openshift#1011 (comment)
[4]: openshift#1011 (comment)
[5]: openshift#1011 (comment)
[6]: openshift#1011 (comment)
[7]: openshift#1011 (comment)
wking added a commit to wking/openshift-api that referenced this pull request Sep 23, 2021
Implementing targeted edge blocking as described in [1].  Except:

* Some properties which are optional in the enhancement are required
  here.  I think this is strictly worse handling for the
  cluster-version operator, because we are just getting this data from
  the upstream update service's Cincinnati JSON, and if the upstream
  says "here's a risk, but I am not linking a URI", and the CVO
  subsequently rejects the entire update payload as invalid, then
  downstream consumers are not hearing about the
  reason/message/matchingRules that the upstream may have been passing
  along.  But I'm not an API approver, and David is asking for the
  +required in [2,3,4,5].  I dunno why he isn't finding my argument's
  convincing, but I'm out of time to continue arguing my position.

* Similarly, the ClusterCondition.Type property has an enum, because
  David requires it [6].  This means that the CVO will need to
  self-censor if there are unrecognized types (because we don't want
  our ClusterVersion writes rejected on API-server validation
  failures).  That means that instead of saying "... and there was
  another rule $UNRECOGNIZED_TYPE that I don't know how to evaluate",
  the CVO will have to silently drop that rule.  And if that's the
  only rule, the CVO will have to silently drop that risk and
  conditional update.  David is fine with that, even if I'm not on
  board, and he's the API approver, so that's what we'll do.

  I've also upcased the types to Always and PromQL; from David:

    Always and PromQL.  We use CamelCase for enumerated values

* history[].overrides became history[].acceptedRisks to satisfy
  David's concerns about misinterpretation [7].

* The server-side apply annotations documented in [8]: listType,
  listMapKey, patchMergeStrategy, and patchMergeKey.

Swagger, deepcopy, and CRD changes were generated with:

  $ make update

[1]: openshift/enhancements#821
[2]: openshift#1011 (comment)
[3]: openshift#1011 (comment)
[4]: openshift#1011 (comment)
[5]: openshift#1011 (comment)
[6]: openshift#1011 (comment)
[7]: openshift#1011 (comment)
[8]: https://kubernetes.io/docs/reference/using-api/server-side-apply/
wking added a commit to wking/openshift-enhancements that referenced this pull request Sep 23, 2021
wking added a commit to wking/openshift-enhancements that referenced this pull request Sep 23, 2021
…anded

Following up on openshift#821 with the changes made in order to land
openshift/api#1011.

The graph-data properties remain optional, because graph-data has been
used for unconditional blocks that do not declare risks today.  The
update service will continue to read those blocks, but will implement
them by pruning 'edges'.  So things that end up in 'conditionalEdges'
will still need all the metadata populated to keep David happy ;).
wking added a commit to wking/cincinnati-graph-data that referenced this pull request Sep 23, 2021
So we can explain why we're blocking the different edges [1] (the
promql -> PromQL type change is in flight with [2]).

The zz in the filename for the vSphere hostname block ensures that one
sorts last alphabetically, because it's the broadest block, and legacy
Cincinnati services will prefer the final regular expression they load
for a given 'to' target.

This is basically a second attempt at my earlier 39bc2fb
(blocked-edges/4.7.4*: Targeted edge blocking and version 1.1.0,
2021-09-01, openshift#1056), which ended up getting reverted in da1254a
(Revert "blocked-edges/4.7.4*: Targeted edge blocking and version
1.1.0", 2021-09-21, openshift#1078) because the production Cincinnati was mad
about the 1.1.0 version string.  [3] taught Cincinnati to relax, and
now that's live (and we never shipped any versions that would be mad
about 1.1.0 to customers, the 4.6.0 Update Service operator went out
before [4]).

[1]: openshift/enhancements#821
[2]: openshift/enhancements#910
[3]: openshift/cincinnati#538
[4]: openshift/cincinnati#314
wking added a commit to wking/cincinnati-graph-data that referenced this pull request Sep 23, 2021
To keep us from forgetting to set 'url', etc. [1,2].

We'll probably want to have PromQL parser validation later [3], but I
write Python a lot faster than I write Go.  This should avoid obvious
mistakes while the PromQL parser gets wired up to a validator.

[1]: openshift/enhancements#821
[2]: openshift/enhancements#910
[3]: https://github.com/openshift/enhancements/blob/4412b51182ba6e007f1308a73e290e81fd66d092/enhancements/update/targeted-update-edge-blocking.md#promql-validation
m-yosefpor pushed a commit to m-yosefpor/enhancements that referenced this pull request Sep 24, 2021
…anded

Following up on openshift#821 with the changes made in order to land
openshift/api#1011.

The graph-data properties remain optional, because graph-data has been
used for unconditional blocks that do not declare risks today.  The
update service will continue to read those blocks, but will implement
them by pruning 'edges'.  So things that end up in 'conditionalEdges'
will still need all the metadata populated to keep David happy ;).
wking added a commit to wking/cincinnati-graph-data that referenced this pull request Sep 27, 2021
So we can explain why we're blocking the different edges [1] (the
promql -> PromQL type change is in flight with [2]).

The zz in the filename for the vSphere hostname block ensures that one
sorts last alphabetically, because it's the broadest block, and legacy
Cincinnati services will prefer the final regular expression they load
for a given 'to' target.

This is basically a second attempt at my earlier 39bc2fb
(blocked-edges/4.7.4*: Targeted edge blocking and version 1.1.0,
2021-09-01, openshift#1056), which ended up getting reverted in da1254a
(Revert "blocked-edges/4.7.4*: Targeted edge blocking and version
1.1.0", 2021-09-21, openshift#1078) because the production Cincinnati was mad
about the 1.1.0 version string.  [3] taught Cincinnati to relax, and
now that's live (and we never shipped any versions that would be mad
about 1.1.0 to customers, the 4.6.0 Update Service operator went out
before [4]).

[1]: openshift/enhancements#821
[2]: openshift/enhancements#910
[3]: openshift/cincinnati#538
[4]: openshift/cincinnati#314
wking added a commit to wking/cincinnati-graph-data that referenced this pull request Sep 27, 2021
So we can explain why we're blocking the different edges [1] (the
promql -> PromQL type change is in flight with [2]).

The zz in the filename for the vSphere hostname block ensures that one
sorts last alphabetically, because it's the broadest block, and legacy
Cincinnati services will prefer the final regular expression they load
for a given 'to' target.

This is basically a second attempt at my earlier 39bc2fb
(blocked-edges/4.7.4*: Targeted edge blocking and version 1.1.0,
2021-09-01, openshift#1056), which ended up getting reverted in da1254a
(Revert "blocked-edges/4.7.4*: Targeted edge blocking and version
1.1.0", 2021-09-21, openshift#1078) because the production Cincinnati was mad
about the 1.1.0 version string.  [3] taught Cincinnati to relax, and
now that's live (and we never shipped any versions that would be mad
about 1.1.0 to customers, the 4.6.0 Update Service operator went out
before [4]).

[1]: openshift/enhancements#821
[2]: openshift/enhancements#910
[3]: openshift/cincinnati#538
[4]: openshift/cincinnati#314
EmilyM1 pushed a commit to EmilyM1/api that referenced this pull request Oct 12, 2021
Implementing targeted edge blocking as described in [1].  Except:

* Some properties which are optional in the enhancement are required
  here.  I think this is strictly worse handling for the
  cluster-version operator, because we are just getting this data from
  the upstream update service's Cincinnati JSON, and if the upstream
  says "here's a risk, but I am not linking a URI", and the CVO
  subsequently rejects the entire update payload as invalid, then
  downstream consumers are not hearing about the
  reason/message/matchingRules that the upstream may have been passing
  along.  But I'm not an API approver, and David is asking for the
  +required in [2,3,4,5].  I dunno why he isn't finding my argument's
  convincing, but I'm out of time to continue arguing my position.

* Similarly, the ClusterCondition.Type property has an enum, because
  David requires it [6].  This means that the CVO will need to
  self-censor if there are unrecognized types (because we don't want
  our ClusterVersion writes rejected on API-server validation
  failures).  That means that instead of saying "... and there was
  another rule $UNRECOGNIZED_TYPE that I don't know how to evaluate",
  the CVO will have to silently drop that rule.  And if that's the
  only rule, the CVO will have to silently drop that risk and
  conditional update.  David is fine with that, even if I'm not on
  board, and he's the API approver, so that's what we'll do.

  I've also upcased the types to Always and PromQL; from David:

    Always and PromQL.  We use CamelCase for enumerated values

* history[].overrides became history[].acceptedRisks to satisfy
  David's concerns about misinterpretation [7].

* The server-side apply annotations documented in [8]: listType,
  listMapKey, patchMergeStrategy, and patchMergeKey.

Swagger, deepcopy, and CRD changes were generated with:

  $ make update

[1]: openshift/enhancements#821
[2]: openshift#1011 (comment)
[3]: openshift#1011 (comment)
[4]: openshift#1011 (comment)
[5]: openshift#1011 (comment)
[6]: openshift#1011 (comment)
[7]: openshift#1011 (comment)
[8]: https://kubernetes.io/docs/reference/using-api/server-side-apply/
wking added a commit to wking/openshift-enhancements that referenced this pull request Apr 19, 2022
…endations"

Users are occasionally confused by the "blocked edges" wording, which
is internal shorthand.

The fact that update recommendations are tracked as a directed graph
(hence "edges") isn't all that relevant for most users, so talk about
"updates" instead of "edges".

While we occasionally drop update recommendations to reduce the
chances that users update into serious issues, from [1]:

  When an update recommendation is supported, it remains supported for
  the life of 4.10, even if the update recommendation is later dropped
  or made conditional.

Updating is still possible, we just don't think it's a good idea
(unless maybe you're elbow deep in a support case, and a support tech
decides to recommend an update based on context that is not available
to the update service).  So replace "blocked" with "not recommended",
to convey the "still possible" aspect.

Also ask for some more specifics in the impact statement template, as
we gear up to support conditional updates fff9d3c
(enhancements/update/targeted-update-edge-blocking: Propose a new
enhancement, 2020-08-07, openshift#821).  I haven't gone as far as "tell us
what PromQL to use in our conditional recommendation", but I'm easing
things in that direction.

[1]: https://docs.openshift.com/container-platform/4.10/updating/understanding-upgrade-channels-release.html#upgrade-version-paths_understanding-upgrade-channels-releases
wking added a commit to wking/openshift-enhancements that referenced this pull request Apr 19, 2022
Because he approved the original pull request creating the directory
[1].

[1]: openshift#821 (comment)
wking added a commit to wking/openshift-enhancements that referenced this pull request Apr 20, 2022
Because he approved the original pull request creating the directory
[1].

[1]: openshift#821 (comment)
wking added a commit to wking/openshift-enhancements that referenced this pull request Apr 20, 2022
…endations"

Users are occasionally confused by the "blocked edges" wording, which
is internal shorthand.

The fact that update recommendations are tracked as a directed graph
(hence "edges") isn't all that relevant for most users, so talk about
"updates" instead of "edges".

While we occasionally drop update recommendations to reduce the
chances that users update into serious issues, from [1]:

  When an update recommendation is supported, it remains supported for
  the life of 4.10, even if the update recommendation is later dropped
  or made conditional.

Updating is still possible, we just don't think it's a good idea
(unless maybe you're elbow deep in a support case, and a support tech
decides to recommend an update based on context that is not available
to the update service).  So replace "blocked" with "not recommended",
to convey the "still possible" aspect.

Also ask for some more specifics in the impact statement template, as
we gear up to support conditional updates fff9d3c
(enhancements/update/targeted-update-edge-blocking: Propose a new
enhancement, 2020-08-07, openshift#821).  I haven't gone as far as "tell us
what PromQL to use in our conditional recommendation", but I'm easing
things in that direction.

[1]: https://docs.openshift.com/container-platform/4.10/updating/understanding-upgrade-channels-release.html#upgrade-version-paths_understanding-upgrade-channels-releases
wking added a commit to wking/openshift-enhancements that referenced this pull request Apr 20, 2022
…endations"

Users are occasionally confused by the "blocked edges" wording, which
is internal shorthand.

The fact that update recommendations are tracked as a directed graph
(hence "edges") isn't all that relevant for most users, so talk about
"updates" instead of "edges".

While we occasionally drop update recommendations to reduce the
chances that users update into serious issues, from [1]:

  When an update recommendation is supported, it remains supported for
  the life of 4.10, even if the update recommendation is later dropped
  or made conditional.

Updating is still possible, we just don't think it's a good idea
(unless maybe you're elbow deep in a support case, and a support tech
decides to recommend an update based on context that is not available
to the update service).  So replace "blocked" with "not recommended",
to convey the "still possible" aspect.

Also ask for some more specifics in the impact statement template, as
we gear up to support conditional updates fff9d3c
(enhancements/update/targeted-update-edge-blocking: Propose a new
enhancement, 2020-08-07, openshift#821).

[1]: https://docs.openshift.com/container-platform/4.10/updating/understanding-upgrade-channels-release.html#upgrade-version-paths_understanding-upgrade-channels-releases
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants