Skip to content
This repository has been archived by the owner on Jan 8, 2024. It is now read-only.

Feature: CLI list deployments shows health for each entry #1594

Merged
merged 10 commits into from
Jun 10, 2021

Conversation

izaaklauer
Copy link
Contributor

@izaaklauer izaaklauer commented Jun 3, 2021

Previously, running waypoint deployment list would attach the most recent health check of any kind to the most recent deployment. With this change, we display the health of each deployment listed.

Example:
Screen Shot 2021-06-03 at 6 48 31 PM

In writing this, I realize that our ListStatusReports endpoint lists all status reports for all resources, with no way to filter to a specific resource or recency limit. We'll probably need to enhance that pretty soon as we build up more report types and volume.

Also it's a big protobuf diff, but it should get us back to the the code produced by the version of protoc locked in by go.mod.

@izaaklauer izaaklauer changed the title CLI list deployments shows health for each entry Feature: CLI list deployments shows health for each entry Jun 3, 2021
@github-actions github-actions bot added the core label Jun 3, 2021
@@ -345,6 +344,9 @@ func (op *statusReportOperation) Do(
return nil, status.Errorf(codes.FailedPrecondition, "unsupported status operation type")
}

// Add the time generated to the outer status report
realMsg.TimeGenerated = report.TimeGenerated
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like something the UI will likely want to include (cc @jgwhite ), so I added it to the top level StatusReport. It also means we don't need to dig into the lower-level status report in the CLI at all anymore.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is going to be super helpful in the UI too.

@izaaklauer izaaklauer requested a review from a team June 3, 2021 22:54
Copy link
Contributor

@jgwhite jgwhite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works great!

image

One small comment on naming consistency, but otherwise looks good to me.

for _, statusReport := range statusReportsResp.StatusReports {
if deploymentTargetId, ok := statusReport.TargetId.(*pb.StatusReport_DeploymentId); ok {
if deploymentTargetId.DeploymentId == b.Id {
switch statusReport.Health.HealthStatus {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to see we’re switching on the outer HealthStatus rather than the SDK’s Health enum.

internal/server/proto/server.proto Outdated Show resolved Hide resolved
Copy link
Contributor

@krantzinator krantzinator left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excited to have this! I had a couple questions and one nit.

internal/cli/deployment_list.go Show resolved Hide resolved
internal/cli/deployment_list.go Outdated Show resolved Hide resolved
@@ -345,6 +344,9 @@ func (op *statusReportOperation) Do(
return nil, status.Errorf(codes.FailedPrecondition, "unsupported status operation type")
}

// Add the time generated to the outer status report
realMsg.GeneratedTime = report.TimeGenerated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:chefs-kiss:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is one AB and the other BA (A = "Generated" B = "Time")?

Renaming fields is backwards compatible in proto so should we rename fields?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch...yeah these should be named the same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha, I actually just changed this after @jgwhite caught that many similar time fields that the server returns are in the AB order, and it might be better to be consistent with other server fields than with the sdk field that we're mirroring: #1594 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If renaming fields is backwards compatible, fixing the SDK field may be the way to go.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I noticed Jamies comment. We could flip them in the SDK if it doesn't break backwards compatibility. It's too bad this wasn't caught sooner 😄

Copy link
Member

@briancain briancain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweeeet, thanks for the fix izaak 😄 A couple of minor comments from me and some +1s on the other feedback given 👍🏻

internal/cli/deployment_list.go Outdated Show resolved Hide resolved
internal/server/proto/server.proto Outdated Show resolved Hide resolved
Copy link
Member

@briancain briancain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@izaaklauer I think we should fix the GeneratedTime = TimeGenerated issue here. We should either flip it back to TimeGenerated on the copied convenience server proto var to match the option in the raw report (and be a little inconsistent), or try to fix the name on the SDK side if it doesn't break backwards compatibility so it is consistent with other parts of the API. If we do change it, it'd be good to get it in a 0.4.1 before anyone really builds anything on top of this to reduce major breaking changes.

izaaklauer added a commit to hashicorp/waypoint-plugin-sdk that referenced this pull request Jun 9, 2021
Matches our convention of other time values.

Prerequisite for hashicorp/waypoint#1594 (discussion hashicorp/waypoint#1594 (review))
@izaaklauer
Copy link
Contributor Author

PR to rename the time field in waypoint-plugin-sdk: hashicorp/waypoint-plugin-sdk#35. I tested using that change in a new server with an old version of the CLI and vice-versa, and the backwards compatibility works! Once the sdk change is merged, I'll update this pr to include it.

@briancain
Copy link
Member

Nice @izaaklauer ! You'll probably also need to update all of the plugins to use the new variable too when they finish generating a report, right? Maybe that can be done in a new PR when we update the SDK version this project is using.

izaaklauer and others added 9 commits June 9, 2021 10:53
In writing this, I realize that our ListStatusReports endpoing lists _all_ status reports for all resources, with no way to filter. We'll probably need to enhance that pretty soon as we build up more report types and volume.

Also it looks like this is a big protobuf diff. The existing generated code has "unknown" protobuf version, so i'm not sure what version I need to revert to to make it match.
Co-authored-by: Jamie White <jamie@jgwhite.co.uk>
Previously, if a plugin does not return status reports, we would say "Unknown Status", which is a bit wordy and also conflicts with the "UNKNOWN" official status, where a plugin _has tried_ to get status and is explicitly returning unknown.

"n/a" kind of implies that the very concept of health doesn't apply to some deployments, which isn't right either, but it's the least-worst thing I can think of.
@izaaklauer
Copy link
Contributor Author

izaaklauer commented Jun 9, 2021

Nice @izaaklauer ! You'll probably also need to update all of the plugins to use the new variable too when they finish generating a report, right? Maybe that can be done in a new PR when we update the SDK version this project is using.

The other plugin changes weren't too big and addresses this feedback, so I looped it in here. We are now consistently using <thing>_time naming convention.

@izaaklauer izaaklauer requested a review from briancain June 9, 2021 15:12
@izaaklauer
Copy link
Contributor Author

Note that because with this change we're recording and reading GeneratedTime from the outer protobuf, and reading HealthStatus from the outer protobuf. We used to read these from the inner sdk pb.any protobuf. This means that deployment list commands with this release will not show a time or health for old status reports. Example:

Screen Shot 2021-06-09 at 2 48 10 PM

(deployment 3 was made with 0.4.0, and 4 with this new version. Note that 3 is missing health info).

The UI is also unaffected, because it doesn't inspect the inner sdk proto.

With the current implementation of health checks, health is only determined immediately after a deployment, and is most valuable at that point, so losing that historical health check data in this CLI view may be acceptable. The next deployment that anyone runs after upgrading waypoint will show health on deployment list.

I would argue that it's not worth the complexity to check the SDK proto in the case where the outer proto is blank, but I'm not wedded to that opinion.

Copy link
Member

@briancain briancain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants