feat: Add the org and space based certificates for syslog drains #1229
### Configuration Flow

- When an app binds a syslog drain, the Cloud Controller should include org and space GUIDs in the drain metadata.
- The system retrieves a Certificate Authority for that org or space from a binding.
- The drain connection must use this certificate for TLS/mTLS authentication.
Could you please add a few words about the backwards compatibility of this flow with the current flow for creating and consuming a syslog drain?

## Summary

This RFC proposes to introduce support for **organization- and space-scoped client certificates** for Cloud Foundry Loggregator syslog drains using mutual TLS (mTLS), covering both **HTTPS** and **syslog+TLS** protocols. By issuing certificates at the org or space level instead of per application, this initiative will simplify certificate lifecycle management, enable centralized rotation, and facilitate integration with central certificate authorities. The change targets the reduction of operational overhead and the enhancement of tenant-level security.
Regarding "By issuing certificates at the org or space level instead of per application". Do you mean for the syslog drain use case? The https://docs.cloudfoundry.org/devguide/deploy-apps/instance-identity.html feature is used in different use cases.
IMO, this is only for the certificates used for Syslog drains and has nothing to do with the instance identity certificates.
rkoster
left a comment
How will existing connections be drained and re-established after a certificate is updated?
stephanme
left a comment
Can you add sections with proposed changes per CF component (e.g. Cloud Controller, Loggregator, CF CLI)? This helps to better understand the consequences of this RFC and the required effort. It also makes it easier to invite the affected WG areas for a review.
```bash
cf create-user-provided-service SPACE-NAME -p '{"ca":"-----BEGIN CERTIFICATE-----\nMIIH...-----END CERTIFICATE-----", "cert":"-----BEGIN CERTIFICATE-----\nMIIH...-----END CERTIFICATE-----","key":"-----BEGIN PRIVATE KEY-----\nMIIE...-----END PRIVATE KEY-----"}'
```
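
Hand-escaping PEM newlines inside the `-p` JSON is easy to get wrong. A minimal sketch of building that payload from files, assuming the CA, certificate, and key are stored as local PEM files; `pem_to_json_string` and `drain_credentials_json` are hypothetical helper names, not part of the cf CLI:

```bash
# Hypothetical helpers; not part of the cf CLI.

# Print a PEM file on a single line, ending every input line with a
# literal backslash-n so the result can be embedded in a JSON string.
pem_to_json_string() {
  awk 'BEGIN { ORS = "\\n" } { print }' "$1"
}

# Build the {"ca":...,"cert":...,"key":...} payload for `-p`.
# Arguments: CA file, certificate file, private key file.
drain_credentials_json() {
  printf '{"ca":"%s","cert":"%s","key":"%s"}' \
    "$(pem_to_json_string "$1")" \
    "$(pem_to_json_string "$2")" \
    "$(pem_to_json_string "$3")"
}
```

The output could then be passed as `-p "$(drain_credentials_json ca.pem cert.pem key.pem)"`, where the three file names are placeholders.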
These example configurations show the creation of user provided service instances that happen to contain certificates. How does it relate to syslog drains as documented in https://docs.cloudfoundry.org/devguide/services/log-management.html ?
Can you provide a complete example that shows e.g. how multiple apps in a space use the same space-scope certificate for a syslog drain binding and maybe even how you would rotate it?
@stephanme The call above creates credentials which have to be bound to the app. The problem with this approach is that these credentials are available to the app via VCAP_SERVICES but are not available to the Syslog Agent when it creates the Syslog Drain. If this were accepted, an app would have to somehow share the credentials with the Syslog Agent, which would be hard to do because the Syslog Agent is a special kind of user provided service implemented as part of CF and not via an external service broker. An external broker could have collected all needed credentials and opened a connection, but this is not the case here.
chombium
left a comment
IMHO, we don't need this change. What we need is better documentation on how to achieve this with the way syslog drains are currently created and work, and with the other CF tools we have at hand.

I understand the problem and the complexity of managing certificates which this RFC tries to address. At the moment, with the current implementation, one can create a single user provided service defining a Syslog Drain and bind it to multiple apps.
flowchart LR
app1["app 1"]
ups["User Provided Service"] --> b1["service binding 1"] --> sl1["Syslog Drain 1"]
app1 --> b1
app2["app 2"]
ups["User Provided Service"] --> b2["service binding 2"] --> sl2["Syslog Drain 2"]
app2 --> b2
app3["app 3"]
ups["User Provided Service"] --> b3["service binding 3"] --> sl3["Syslog Drain 3"]
app3 --> b3
appn["app n"]
ups["User Provided Service"] --> bn["service binding n"] --> sln["Syslog Drain n"]
appn --> bn
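
As CLI calls, the diagram corresponds to one service instance and one binding per app. A minimal sketch, assuming the apps live in the currently targeted space and the drain instance already exists; `bind_apps_to_drain` and the `CF` override variable are hypothetical, while `cf bind-service` is the existing CLI command:

```bash
# Hypothetical helper: bind every listed app to the same drain instance.
# Set CF=echo for a dry run that only prints the commands.
# Arguments: drain service instance name, then one or more app names.
bind_apps_to_drain() {
  drain_name=$1
  shift
  for app in "$@"; do
    # Stop at the first failing binding instead of continuing silently.
    ${CF:-cf} bind-service "$app" "$drain_name" || return 1
  done
}
```

For example, `bind_apps_to_drain mydrain app1 app2 app3` would issue three `cf bind-service` calls against the same instance.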
This way one can share the same Syslog Drain certificates between apps. We also used to have a cf drain CLI plugin for managing
The Syslog Drains are created as user provided services and hence are scoped to a space. One can share a service between spaces in an org with
The only possible downsides of this approach are:
This RFC:
IMHO, we don't need this change. What we need is better documentation on how to achieve this. Here is an example of the whole flow:
# target org1 and space1
cf target -o org1 -s space1
# create the syslog drain
cf cups mydrain -l https://my-drain.example.com -p {...}
# bind app1 to mydrain
cf bind-service app1 mydrain
# share the service with space2
cf share-service mydrain -s space2
# share the service with space3 in org2
cf share-service mydrain -s space3 -o org2
# target space2
cf target -s space2
# bind app2 to mydrain
cf bind-service app2 mydrain
# target org2 space3
cf target -o org2 -s space3
# bind app3 to mydrain
cf bind-service app3 mydrain

Based on the discussion above, I suggest closing this RFC. There is no need for an RFC to improve the documentation :-)

@ZPascal it would be great if the outcome of this were an improvement to the existing docs.

Hi, I did a few more tests of the things that I wrote previously and found out that I was wrong about service instance sharing: I had mixed up "normal" platform/CF Market provided services with user-provided services :-/ The diagram about sharing one/reusing an existing service instance is still valid in a single space. I've tried two more things:
This means that the RFC is still valid. IMO, as Syslog Drains are only one type of user-provided service, we should definitely not change the behavior of all user-provided services or, even worse, add special handling for Application Syslog Drains. The only way forward I see for this RFC is to adjust the whole Syslog Drain creation process via the
The Cloud Controller's syslog drain url controller will be unchanged. The Syslog Agent won't be adjusted as all the heavy lifting will be done by the Cloud Controller. The only possible thing to adjust would be error handling. There are three major challenges with this approach:
Whatever we do, some non-trivial changes will be needed in the Cloud Controller. We should keep in mind the implementation effort, the complexity, and the added value for the app devs. This is a nice-to-have feature, but I cannot estimate its possible adoption and usage. Tbh, I understand that having org/space Syslog drains will reduce the management effort for the app devs, but based on the implementation effort, I'm not totally convinced that we need this. What do you think about this? It would be great to hear from someone from the CAPI team. We have one issue about org and space Syslog drains from @Benjamintf1, the former lead of the CF Logging and Metrics Team. I don't know if he had some other ideas about the implementation which he would like to share...

I had a chat with @stephanme today, and we've concluded that
We see two possible ways to proceed with this:

Hi @chombium, my opinion about the listed options in your last comment is:

I don't think that there is a blocker to making UPS instances shareable or to supporting service keys. It may become more complex to find all bindings that need to be updated when the UPS instance is updated. Bindings to a UPS are updated directly when the credentials of the UPS are updated, in contrast to service bindings to regular service instances, where a new binding has to be created to update credentials. I think this would be a good step towards reducing the differences between user-provided services and managed services. We had a similar discussion in the [RFC] Service Credential Binding Rotation for Apps
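
Because bindings to a UPS pick up new credentials when the instance is updated, rotating a shared drain certificate could come down to one call per service instance. A minimal sketch, assuming an existing drain instance and a credentials payload in the `ca`/`cert`/`key` shape from the RFC example; `rotate_drain_credentials` and the `CF` override variable are hypothetical, while `cf update-user-provided-service` is the existing CLI command:

```bash
# Hypothetical helper: replace the credentials of an existing
# user-provided service instance.
# Set CF=echo for a dry run that only prints the command.
# Arguments: service instance name, credentials as a JSON string.
rotate_drain_credentials() {
  service_name=$1
  credentials_json=$2
  ${CF:-cf} update-user-provided-service "$service_name" -p "$credentials_json"
}
```

This only covers the credential update itself. Whether already established drain connections are torn down and re-established with the new certificate is the open question raised earlier in this review and is not answered by the sketch.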

Thank you all very much for your valuable input and constructive contributions to this discussion. We agree with the suggestion from @stephanme and will move forward with exploring the shareability of User-Provided Services (UPS) as a next step. If needed, we will open a dedicated RFC to address the open questions and implementation details together with the community. Enabling shareability for UPS would also help to generalize and align this functionality more closely with managed services, further reducing the differences between the two. As the main points have been addressed here, we will close this RFC for now. We sincerely apologize for the delayed response and appreciate your patience and engagement throughout this process. Thanks again to everyone who contributed their time and ideas. Your input is greatly appreciated!
Description
This RFC proposes to introduce support for organization- and space-scoped client certificates for Cloud Foundry Loggregator syslog drains using mutual TLS (mTLS), covering both HTTPS and syslog+TLS protocols. By issuing certificates at the org or space level instead of per application, this initiative will simplify certificate lifecycle management, enable centralized rotation, and facilitate integration with central certificate authorities. The change targets the reduction of operational overhead and the enhancement of tenant-level security.
Involved Working Groups:
@cloudfoundry/toc
@cloudfoundry/wg-app-runtime-platform-logging-and-metrics-approvers
@cloudfoundry/wg-app-runtime-platform-logging-and-metrics-reviewers