Skip to content
This repository has been archived by the owner on Jan 8, 2024. It is now read-only.

Record and display count of entrypoint config connections in deployment status report #3043

Merged
merged 1 commit into from
Mar 17, 2022

Conversation

catsby
Copy link
Contributor

@catsby catsby commented Feb 24, 2022

This PR adds the count of entrypoint connections streaming configuration updates per deployment to deployment status reports. At this time there is no practical way of correlating the gRPC connection to a specific instance or pod in a deployment, so for now we simply display the current count of connections.

There are some caveats or limitations here:

  • To my knowledge this is the only situation where we include information from the Waypoint server into a status report of a deployment/app/project. We do this by first generate a status report, the logic for which is implemented in the plugins and executed by a runner. After completion we capture the report and add the count by querying the instances for a deployment from the Waypoint server, and then modify the status report and re-upsert it with the count. The approach taken is a bit naive and a little brute force-ish for now, and only applies to status reports for deployments (not releases, etc). If we decide to include future metrics from the Waypoint server in status reports, we may want to consider a more elegant/flexible solution that could work for all report types, but for now we don't anticipate any more additions.
  • As mentioned we have no easy way of correlating a instance recorded on the server as receiving entrypoint configuration updates and say a specific pod in a deployment, so at this time we can't deduce which pods are connected and which are not.
  • The server doesn't immediately know how many deployment "instances" there should be, as things like replica_count are handled by the plugin.
  • The burden is unfortunately left to the user to know if the count is correct, but hopefully surfacing the count can be a helpful indicator in debugging applications
  • Extracting information about Entrypoint being disabled, either by using disable_entrypoint in the build or by using the environment variable WAYPOINT_CEB_DISABLE was rather difficult, so I settled with displaying the count for now. Perhaps a follow up PR could come with this information after I circle back internally and this PR is reviewed.

Example of output before this change:

» Deployment Summary
APP NAME        VERSION WORKSPACE       PLATFORM        ARTIFACT        LIFECYCLE STATE
web             v5      default         kubernetes      id:5            SUCCESS

After:

» Deployment Summary
APP NAME        VERSION WORKSPACE       PLATFORM        ARTIFACT        LIFECYCLE STATE ENTRYPOINT CONNECTIONS
web             v1      default         kubernetes      id:1            SUCCESS         1

Full output:

Current status for application "web" in project "nginx-project" in server context "localhost:9701".

» Application Summary
APP     WORKSPACE       DEPLOYMENT STATUS       DEPLOYMENT CHECKED      RELEASE STATUS  RELEASE CHECKED
web     default         ✔ READY                 now                     ✔ READY         now


» Deployment Summary
APP NAME        VERSION WORKSPACE       PLATFORM        ARTIFACT        LIFECYCLE STATE ENTRYPOINT CONNECTIONS
web             v1      default         kubernetes      id:1            SUCCESS         1


» Deployment Resources Summary
TYPE            NAME                    PLATFORM        HEALTH  TIME CREATED
deployment      web-v1                  kubernetes      READY   31 seconds ago
pod             web-v1-77d95c4cd-b74t2  kubernetes      READY   31 seconds ago


» Release Summary
APP NAME        VERSION WORKSPACE       PLATFORM        ARTIFACT        LIFECYCLE STATE
web             v1      default         kubernetes      id:1            SUCCESS


» Release Resources Summary
TYPE    NAME    PLATFORM        HEALTH  TIME CREATED
service web     kubernetes      READY   19 seconds ago

@github-actions github-actions bot added the core label Feb 24, 2022
@catsby catsby requested a review from a team February 24, 2022 16:33
Copy link
Contributor

@evanphx evanphx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only comment is if we should call this entrypoint. I'd probably call it instance.

Copy link
Member

@briancain briancain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! I wrote out a question that we both had that hopefully the team can think about and answer. Otherwise makes sense 😄

// in case that changes.
if report != nil {
// check if we have any connected instances
resp, err := a.client.ListInstances(ctx, &pb.ListInstancesRequest{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had a question here for the wider team - It would be nice if Status Reports had some kind of pre-Upsert hook for adding information like active entrypoint connections to a report that a plugin typically wouldn't have access to at the time the Status Report is generated. @catsby and I figured for now since it's just one extra piece of data, it's probably OK to do a double Upsert and append this data later after the plugin has already upserted its report.

In the future though, it would be nice if we had something a bit more generic in place with our system to add data like this. But if someone has a good idea of how to easily achieve that, we're all ears!

Finally - maybe a double Upsert isn't a huge deal here if the cost is low to modify an existing report in the db with a little bit of extra information here.

@catsby
Copy link
Contributor Author

catsby commented Feb 25, 2022

My only comment is if we should call this entrypoint. I'd probably call it instance.

@evanphx I can see the reasoning behind this. On the other hand, this count represents the number of instances that have established a connection with the Waypoint server. If for instance there's an issue and the entrypoint cannot connect to the Waypoint server (firewall rules, et. al.) then the count would be zero, which could imply that no instances are running. Does that make sense?

Copy link
Contributor

@izaaklauer izaaklauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great!

In the future, it would be nice to build on this this to change the status report's overall health if we don't have instances and shouldn't have instances. I don't think many new users will know enough to check that field under status reports and recognize it means they need to look at their firewall rules.

Totally recognize "shouldn't" is the hard part - there are multiple ways to disable the CEB (at build-time or deploy-time), and they aren't all visible to us now at this point where we're adding the connected instance count.


// count of active entrypoint config connections. This is currently only
// applicable to deployment type status reports
uint32 entrypoint_config_connections = 14;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just name this instances_count cause that's what we're measuring. If we change how we measure that in the future its less leaky than "entrypoint config connections."

@hashicorp-cla
Copy link

hashicorp-cla commented Mar 12, 2022

CLA assistant check
All committers have signed the CLA.

Query active instances connections for deployment status reports
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants