Commit

Merge pull request #422 from wking/raise-operator-leveling-timeout
Bug 1862524: pkg/cvo/status: Raise Operator leveling grace-period to 20 minutes
openshift-merge-robot authored Aug 1, 2020
2 parents 497fda6 + 08d5c42 commit ed864d6
Showing 2 changed files with 3 additions and 3 deletions.
2 changes: 1 addition & 1 deletion docs/user/status.md
@@ -22,7 +22,7 @@ If this happens it is a CVO coding error, because clearing [`desiredUpdate`][api
`ClusterOperatorNotAvailable` (or the consolidated `ClusterOperatorsNotAvailable`) is set when the CVO fails to retrieve the ClusterOperator from the cluster or when the retrieved ClusterOperator does not satisfy [the reconciliation conditions](reconciliation.md#clusteroperator).

Unlike most manifest-reconciliation failures, this error does not immediately result in `Failing=True`.
-Under some conditions during installs and updates, the CVO will treat this condition as a `Progressing=True` condition and give the operator up to ten minutes to level before reporting `Failing=True`.
+Under some conditions during installs and updates, the CVO will treat this condition as a `Progressing=True` condition and give the operator up to twenty minutes to level before reporting `Failing=True`.

## RetrievedUpdates

4 changes: 2 additions & 2 deletions pkg/cvo/status.go
@@ -344,13 +344,13 @@ func (optr *Operator) syncStatus(ctx context.Context, original, config *configv1

// convertErrorToProgressing returns true if the provided status indicates a failure condition can be interpreted as
// still making internal progress. The general error we try to suppress is an operator or operators still being
-// unavailable AND the general payload task making progress towards its goal. An operator is given 10 minutes since
+// unavailable AND the general payload task making progress towards its goal. An operator is given 20 minutes since
// its last update to go ready, or an hour has elapsed since the update began, before the condition is ignored.
func convertErrorToProgressing(history []configv1.UpdateHistory, now time.Time, status *SyncWorkerStatus) (reason string, message string, ok bool) {
	if len(history) == 0 || status.Failure == nil || status.Reconciling || status.LastProgress.IsZero() {
		return "", "", false
	}
-	if now.Sub(status.LastProgress) > 10*time.Minute || now.Sub(history[0].StartedTime.Time) > time.Hour {
+	if now.Sub(status.LastProgress) > 20*time.Minute || now.Sub(history[0].StartedTime.Time) > time.Hour {
		return "", "", false
	}
	uErr, ok := status.Failure.(*payload.UpdateError)
