Skip to content
This repository has been archived by the owner on Jul 11, 2023. It is now read-only.

Delete Certificate for an Envoy for a Pod that was Terminated #1956

Merged
merged 5 commits into from
Nov 17, 2020

Conversation

draychev
Copy link
Contributor

@draychev draychev commented Oct 29, 2020

The goal of this PR is to remove certificates for Pods that have been terminated. Without this we would continue to issue certificates and keep them in cache for the life of the OSM Controller pod.

This PR introduces 2 new functions (within Mesh Catalog):

  • releaseCertificate()
  • updateRelatedProxies()

These functions will be evaluated (sequentially) when an Announcement of type PodDeleted arrives.

To help find the certificate for a given Pod (identified by its Kubernetes UID), this PR introduces podUIDToCN into Mesh Catalog. This is a thread safe map, which will be populated with Pod UID to Certificate mapping.


This PR will fix #1719

After this  I am going to hand-off the work around refining and rearchitecting the OSM eventing to @eduser25 and @shashankram who will continue to hack on the signaling mechanism.

I hope that the changes I made here would be easily translated to whatever system we come up with to handle these events in the future. I deliberately did not go into future-proofing this - we could add the entire Kubernetes object in the Announcement or launch the handlers in goroutines to unblock.  I'm leaving this for another iteration (and if at all necessary).


Affected area:

  • New Functionality [ ]
  • Documentation [ ]
  • Install [ ]
  • Control Plane [ ]
  • CLI Tool [ ]
  • Certificate Management [ ]
  • Networking [ ]
  • Metrics [ ]
  • SMI Policy [ ]
  • Security [ ]
  • Tests [ ]
  • CI System [ ]
  • Performance [ ]
  • Other [ ]

Please answer the following questions with yes/no.

  • Does this change contain code from or inspired by another project? If so, did you notify the maintainers and provide attribution?

@draychev
Copy link
Contributor Author

ref #1719

@draychev draychev force-pushed the cleanup-stale-certs--1 branch 2 times, most recently from 2848660 to 3fce8c8 Compare November 4, 2020 23:57
@draychev draychev changed the title catalog: Cleanup certificates for deleted pods Delete Certificate for an Envoy for a Pod that was Terminated Nov 4, 2020
@draychev draychev force-pushed the cleanup-stale-certs--1 branch 8 times, most recently from cc513be to f9b0d3c Compare November 11, 2020 02:52
@codecov-io
Copy link

codecov-io commented Nov 11, 2020

Codecov Report

Merging #1956 (12cf167) into main (d4d0a63) will increase coverage by 0.07%.
The diff coverage is 44.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1956      +/-   ##
==========================================
+ Coverage   57.30%   57.37%   +0.07%     
==========================================
  Files         139      140       +1     
  Lines        5692     5739      +47     
==========================================
+ Hits         3262     3293      +31     
- Misses       2427     2443      +16     
  Partials        3        3              
Impacted Files Coverage Δ
pkg/catalog/repeater.go 27.58% <0.00%> (-3.19%) ⬇️
...ertificate/providers/tresor/certificate_manager.go 60.63% <0.00%> (-0.66%) ⬇️
pkg/envoy/ads/grpc.go 0.00% <0.00%> (ø)
pkg/envoy/ads/stream.go 0.00% <0.00%> (ø)
pkg/envoy/proxy.go 26.47% <0.00%> (ø)
pkg/catalog/announcement_handlers.go 44.44% <44.44%> (ø)
pkg/envoy/ads/response.go 78.20% <75.00%> (-0.47%) ⬇️
pkg/catalog/catalog.go 100.00% <100.00%> (ø)
pkg/catalog/proxy.go 100.00% <100.00%> (ø)
pkg/endpoint/providers/kube/fake.go 70.83% <0.00%> (ø)
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d4d0a63...12cf167. Read the comment docs.

@draychev draychev force-pushed the cleanup-stale-certs--1 branch 4 times, most recently from bbaebee to a8aafeb Compare November 11, 2020 20:27
@draychev draychev force-pushed the cleanup-stale-certs--1 branch 10 times, most recently from c568549 to 0ed5cbc Compare November 12, 2020 22:15
@draychev draychev force-pushed the cleanup-stale-certs--1 branch from 0ed5cbc to 85787b4 Compare November 12, 2020 22:19
@draychev draychev marked this pull request as ready for review November 12, 2020 22:25
@draychev draychev requested a review from a team as a code owner November 12, 2020 22:25

var errEventNotHandled = errors.New("event not handled")

func (mc *MeshCatalog) releaseCertificate(ann announcements.Announcement) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know the function is not exported but would you mind adding a comment for this function anyway?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea to document these! Added a docstring here!


func (mc *MeshCatalog) releaseCertificate(ann announcements.Announcement) error {
whatWeGot := ann.Type
whatWeCanHandle := announcements.PodDeleted
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

love these variable names

shashankram
shashankram previously approved these changes Nov 13, 2020
@draychev draychev requested a review from michelleN November 13, 2020 23:01
pkg/catalog/proxy.go Outdated Show resolved Hide resolved
snehachhabria
snehachhabria previously approved these changes Nov 16, 2020
@draychev draychev dismissed stale reviews from snehachhabria and shashankram via 90e7993 November 16, 2020 21:04
@draychev draychev merged commit bf58d62 into openservicemesh:main Nov 17, 2020
@draychev draychev deleted the cleanup-stale-certs--1 branch November 17, 2020 15:49
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cleanup certificates for terminated Pods from OSM controller cache
6 participants