CFE-984: Add support for custom CA bundle for reencrypt termination type routes #998

Draft
wants to merge 1 commit into
base: master
from

Conversation

bharath-b-rh
Contributor

This PR implements the proposed enhancement that lets the router use a custom CA bundle to verify the server's certificate for reencrypt routes when destinationCA is not configured.

The following functionality is added:

  • A new operator controller, cabundleconfigmap, creates the ingress-ca-bundle configmap. It holds a CA certificate bundle that combines the certificates from the service-ca-bundle configmap and the user-created admin-ca-bundle configmap.
  • The operator watches the ingress-ca-bundle, service-ca-bundle, and admin-ca-bundle configmaps for modifications and updates ingress-ca-bundle to the desired state.
  • The operator makes ingress-ca-bundle available in the operand router as a file at /var/run/configmaps/ca-trust/ca-bundle.crt and sets the DEFAULT_DESTINATION_CA_PATH environment variable, which the operand router reads, to the same path.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Nov 17, 2023
@openshift-ci-robot
Contributor

openshift-ci-robot commented Nov 17, 2023

@bharath-b-rh: This pull request references CFE-984 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.15.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot requested review from alebedev87 and gcs278 November 17, 2023 07:24
@bharath-b-rh
Contributor Author

/jira refresh

@openshift-ci-robot
Contributor

openshift-ci-robot commented Nov 17, 2023

@bharath-b-rh: This pull request references CFE-984 which is a valid jira issue.

In response to this:

/jira refresh


@bharath-b-rh
Contributor Author

/hold openshift/enhancements#1514

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 17, 2023
@lihongan
Contributor

cc @ShudiLi

@bharath-b-rh
Contributor Author

/retest

@bharath-b-rh
Contributor Author

/retest

@bharath-b-rh
Contributor Author

/test e2e-gcp-operator

@bharath-b-rh
Contributor Author

/test e2e-gcp-operator

@bharath-b-rh
Contributor Author

/test e2e-aws-ovn-single-node

@bharath-b-rh
Contributor Author

/cc @Miciah

@openshift-ci openshift-ci bot requested a review from Miciah November 27, 2023 15:10
@ShudiLi
Member

ShudiLi commented Nov 28, 2023

Tested it with 4.15.0-0.test-2023-11-28-002931-ci-ln-z8rzn72-latest
1.
% oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.15.0-0.test-2023-11-28-002931-ci-ln-z8rzn72-latest True False 113m Cluster version is 4.15.0-0.test-2023-11-28-002931-ci-ln-z8rzn72-latest

% oc -n openshift-ingress get deployment router-int1 -oyaml | grep "var/run/configmaps"
value: /var/run/configmaps/ca-trust/ca-bundle.crt
- mountPath: /var/run/configmaps/ca-trust

% oc -n openshift-ingress get deployment router-int1 -oyaml | grep -n "name: ingress"
176: name: ingress-ca-bundle
215: name: ingress-ca-bundle
217: name: ingress-ca-bundle
%

% oc -n openshift-config create configmap admin-ca-bundle --from-file=ca-bundle.crt=two-servers.pem

% oc -n openshift-ingress rsh router-int1-86c8bf4447-8475m
sh-4.4$ env | grep -i dest
DEFAULT_DESTINATION_CA_PATH=/var/run/configmaps/ca-trust/ca-bundle.crt
sh-4.4$ cat /var/run/configmaps/ca-trust/ca-bundle.crt
-----BEGIN CERTIFICATE-----
MIIDUTCCAjmgAwIBAgIIaBqBoB4eWkcwDQYJKoZIhvcNAQELBQAwNjE0MDIGA1UE
Awwrb3BlbnNoaWZ0LXNlcnZpY2Utc2VydmluZy1zaWduZXJAMTcwMTEzMzg1NjAe
Fw0yMzExMjgwMTEwNTZaFw0yNjAxMjYwMTEwNTdaMDYxNDAyBgNVBAMMK29wZW5z
aGlmdC1zZXJ2aWNlLXNlcnZpbmctc2lnbmVyQDE3MDExMzM4NTYwggEiMA0GCSqG
SIb3DQEBAQUAA4IBDwAwggEKAoIBAQDHJIzrDwikWgCR9t+Db1/XIgxAmZej+xX9
fqCdg25AvurBQGBf6H0FmuTd4KPXofyJIn2StTTSl3Mn0bU4w6mocZOshB8r3yw8
NS3SZr6EttHFFl/qkUCU9R8+IHtrGoERrdGewbT+MVTfnprWqyFf/hNW1HBpd9Tl
RDSCSWGb+sydmByhMdes9ZCYMLrBIPn5kfDTNVsQmfyb2f3LQib4spAbNA5VN2gE
Y2hnKkAcH9k8pUKRnfHLWFHvjwerht4aYMVIhsFw2MNzmCs4P1wl8bJe8bogfvQb
yga9LbNhB0E0x5ry6ve8tRi/wbKd7wD8cyoqNIZz2/B53xTuEj8lAgMBAAGjYzBh
MA4GA1UdDwEB/wQEAwICpDAPBgNVHRMBAf8EBTADAQH/MB0GA1UdDgQWBBRWgpRk
LRr8h+01ubDTGuGw1uOQ+zAfBgNVHSMEGDAWgBRWgpRkLRr8h+01ubDTGuGw1uOQ
+zANBgkqhkiG9w0BAQsFAAOCAQEAwwfG/ahwwBLMTvTbYDQfiMsns+Pm2u4zMxbD
mYKAGBHFL5Ftp/A/JuN1lSzVHyG8nebV0yaG9zfu4NFpFzZAAH4Gws1NwYqMLhfI
afSiCJijGgh6OcxUixR9sQzo/XrSDVKjqvF8xLBPaQPU+5RblWY7BVBtUHsjx+ho
PJFm4U8ID5gAdEmO/xDybs3OjiAw+b7vMSM4vtgYSGah2IfXVQ1+O7aIBMPJLKup
BQH0nOrsoVOQ5ISX2i0GxHvz7HVLruIBV2mVnNeD9X3uNr6etfzrMU2IxAf+1E4Q
r0AXh0n8+UvpaMB7DKW+qgIEs7uSXOaZVuDbHv7lRfAuCo+l2Q==
-----END CERTIFICATE-----

-----BEGIN CERTIFICATE-----
MIIFfjCCA2agAwIBAgIUIju9/9m5se5V6/25Ks84evGrnK8wDQYJKoZIhvcNAQEL
BQAwYTELMAkGA1UEBhMCWFgxFTATBgNVBAcMDERlZmF1bHQgQ2l0eTEcMBoGA1UE
CgwTRGVmYXVsdCBDb21wYW55IEx0ZDEdMBsGA1UEAwwUd3d3LnVzZXItZXhhbXBs
ZS5jb20wHhcNMjMxMTIwMTExMDE0WhcNMzMxMTE3MTExMDE0WjBhMQswCQYDVQQG
EwJYWDEVMBMGA1UEBwwMRGVmYXVsdCBDaXR5MRwwGgYDVQQKDBNEZWZhdWx0IENv
bXBhbnkgTHRkMR0wGwYDVQQDDBR3d3cudXNlci1leGFtcGxlLmNvbTCCAiIwDQYJ
KoZIhvcNAQEBBQADggIPADCCAgoCggIBAOuvPX52hg+gjCRVmB6H4b03tTcxLaSI
JtxjV+Nb3IKIKaKWOTlp/0r1mCy14D76xwhUGTJXu8EUwLnqXxthm2Pk7HYjxQvy
mx3X6nUZu96ZQSeA1LCopUHFIQ+VVqUIwUpQuof5r7iMWDtpt0kbe7tB1bcrR7I0
qMJK6yCuQC0YlsSXWWjdHbnTXX/EZ/gOkROYGXq1387BSVTR8unNS6iGog4msVSk
g3FahqSgTqNCvaqKasHHfR5h5KoSXy2BoKBp9kqQPRwFYgS8ZD8DumynkOeMW1D3
7De6EcKkxN7MSsO4TnILbnwjeN3xfsQ+C7uyh6Q3dwlsTHYmDm20X9oazzZpmM9U
t+StTyny0tREPlfHoh6f2yht2ZEcoL0ehxVIrOwaNDVOn8upNnnHTpBNV2RQFZuS
KPILX5XL2rpdX1q9UbnghoMj+OnnSsiJRBQQMUdqbHS4IoN7fP/njE13aI7lNSv6
XFdFTU+2bBKxR8CJm/Urkj8rxV/9bZX3O3jPkg3FBm2Wy8JYW/4LPHFg6cNb3muL
zPH5NyXoAftPbmI41yg4NE0WVDS3W/MymqrOz6PM8WpDqLdgBFtGNRtbLoeg6p2F
o2gTpFJo9xgAXN3DeJVRLOAJudNnd5dXS9BNQRGtbf1NRLNagv+sx5JcvQXpfzlt
htwKItdqh47hAgMBAAGjLjAsMAsGA1UdDwQEAwIF4DAdBgNVHREEFjAUghIqLnVz
ZXItZXhhbXBsZS5jb20wDQYJKoZIhvcNAQELBQADggIBAGZLph2kLifFX+G2TFpg
8AjgHGL6GPe0h/X5ubb9sThMHZVLkuxNYYTNtoWmaL9ARBh98DtvqsFpAp9kRzuO
rNiocOZ2YLYezS/C3wFNNZGHJ4oteA7gW9KGDfplXm/svHM8TLaQ1hcXGK2YxLzM
vw1n6UmRr8JyYqwfGteMdP//ELaGR9qBRqAq8g8DeljWXWiLuunObjYXczC0Ypeb
IU9TJX4K4LjfXm+M5cIjmb4YSJG1UDplwJgxZQVyxUWBXuSxS8g0E2NYXGR8ngPh
ao+zYzpzblQHZ60NJgracHZnUjHANVBvqqDsU3CNzqjGpw+8G9eCxfgOhiQHBZPd
u44AqbTFtOrT6gY4bz2SMloqxOhG8HJoubxOF/EnIPFCSCEHUHRdlhE2+KGpNrwW
QT2SAM+VpF28DIIszdUyL7Fns53dMN0WHvDclVXz0/I9OSpHYOlM6V6Lg3Mx8ChP
MT5jSEMvX/vSoXNW0jrxo+P61qBuaBTcgvkT0ZMy/j7O/VWe4w5f7lcXWL8P2KVk
RaNzkyQR4VLP/DuHruHbR7eJ1AKzxKd0XSDNEoq+VwMUZ2PaykGvHDaB+0qpvmLy
V+6Uu/53jtjs5NNJIzzipm1yWRchjE/3P3kPQ22o4JUU1xeWTEfDH+kXgKS3+4Lq
EwXw1Q4BE/X2DKJWHr9kfoOc
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
MIIFCzCCAvOgAwIBAgIJANBQMR0sicDLMA0GCSqGSIb3DQEBCwUAMCAxHjAcBgNV
BAMMFXd3dy51c2VyMi1leGFtcGxlLmNvbTAeFw0yMzExMjgwMjM4MjFaFw0zMzEx
MjUwMjM4MjFaMCAxHjAcBgNVBAMMFXd3dy51c2VyMi1leGFtcGxlLmNvbTCCAiIw
DQYJKoZIhvcNAQEBBQADggIPADCCAgoCggIBALdWKEofenutye6CGT5K+KzPYjlT
9L6ELwIIRWnHY8S7CkzoCMt+Ow+v1CbnpddHsH+GotM1lnXDFX6AjwhEgvQgNM6L
z614cnjorJxb0gTTtYGiZpNINNgEWgnCAH+ba4FfIl293VIJ6idCPr4U30MNX2I3
e40fW5+4dAFHykROY5HBlqVwUSTp9Jm0y72BuqryEH3e3ZH8K6kEfOMaPfZ8gG5D
OuCgXJvza00TGAcG+tE8YuEVo/jBo8xNLPzLH6O+dN0Fgp+QoESkGdc1PlTIIGF+
nRQR5WYigAGvh5aNTkYShEE0nhwRxxT7E2wRIEcNotjQAFGRBzVp32AuMS6350gB
D9/FziR0KxKiqLwHTcGUI64mzfrfdPekqVfy2PdEIj5LT0FkiVJF3h4BFzW1oj50
YzZKwLJ0NDp2dSIaKYDW+H6lZKc+dT+p/q6ZF9QAms7elVqhlWNAiVbNLRteLdyL
HwchOyKY+zbwP8k0yWMidWG1RrUJf5GhQU9diA/IjAQhBtoZ2126+CPfs5yTCgTs
824uV0H0G9Ss3aT5pf3fKgUM6g0eJ5JlIlhFTKeh9NDrjHHCpqGHDc97atj6p9+L
hvI0nINudZ1UlJp6MzbuXpIb8OmwbV1y7fMny8bRhLildJ9RZpWTFRPcSSe1zoMb
PaUYUwBHEbYuDMKbAgMBAAGjSDBGMAwGA1UdEwQFMAMBAf8wNgYDVR0RBC8wLYIT
Ki51c2VyMi1leGFtcGxlLmNvbYIWc2VjLWFwYWNoMi5kZWZhdWx0LnN2YzANBgkq
hkiG9w0BAQsFAAOCAgEAnM1Y+56prgs0krD8vnfPqC9Bjv7eZIkww2JEB14jPqpP
SQUaW2vH0lZn/zX3qhZHGJwtKaXHP6jTooNxRXtawmmrLO5vOXiEcVWgHl6DXzVj
Ij/TXN2M7Fy5O/CaSDjSnZAsQy62hZqe/f/W9oATvt5jmt8OQia+rcsoSMip3Ekj
bClYHPf0LCfSc+xL6Q5y+xKQo2gLIFISp6io/lqsv4GPZ3BR1MTss2oFuYG64yQH
Km+KMUzzXjRApXFhPu4i9SWJKjgHpRd3HcJ1WSt7Ccm8GBpfQK+fpExDIwmyvjcl
KlH5PRtV462CRdF9PzBT/InNoFkZ50kjfYxt/FpNqBm6nWO6bUGK71prds+tsxil
psjOTBrT681IsveQIDsybevyT7flu5TZoKIhC/c2vxoXvNzK7mmnE+X0LIPPbKAX
AvQE3S5hDDA+DePwj+7U35q3AnGymoHQhu+c4tiYok/ZYpPD35BVW9NRuBSGZwBN
yh2i+rnf6iE9Q022FExNVElQPdwRSa2AlDFDPZAQbQJzZqaus45N1jiGUCCU4enB
ojcWwmdoIt5R7ddKJEKS+fVWfoUpEY1Ryw/SgeQTwAoy4BB6TujxGlYRsJ5rQlqq
CdURB9oEwSzpHYUKdpxzT41fPjlwuj/n9UQ6Y3XZVqGs2nz1mVzCOzqkaiLPy/Q=
-----END CERTIFICATE-----
sh-4.4$

% oc create route reencrypt myreen2 --service=sec-apach2 --hostname=reen222.int1.shudi-415g28.qe.gcp.devcluster.openshift.com
route.route.openshift.io/myreen2 created
% oc create route reencrypt reenwithdstca --service=service-secure --dest-ca-cert=ca.pem --hostname=reenwithdstca.int1.shudi-415g28.qe.gcp.devcluster.openshift.com
route.route.openshift.io/reenwithdstca created

% oc get route
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
myreen2 reen222.int1.shudi-415g28.qe.gcp.devcluster.openshift.com sec-apach2 sec-apach2 reencrypt None
reenwithdstca reenwithdstca.int1.shudi-415g28.qe.gcp.devcluster.openshift.com service-secure https reencrypt None

sh-4.4# curl https://reen222.int1.shudi-415g28.qe.gcp.devcluster.openshift.com -k
this a test!
sh-4.4#

sh-4.4# curl https://reenwithdstca.int1.shudi-415g28.qe.gcp.devcluster.openshift.com -k
Hello-OpenShift-1 https-8443
sh-4.4#

@ShudiLi
Member

ShudiLi commented Nov 28, 2023

/label qe-approved
thanks

@openshift-ci openshift-ci bot added the qe-approved Signifies that QE has signed off on this PR label Nov 28, 2023
@openshift-ci-robot
Contributor

openshift-ci-robot commented Nov 28, 2023

@bharath-b-rh: This pull request references CFE-984 which is a valid jira issue.


@bharath-b-rh
Contributor Author

Thank you @ShudiLi!

@snarayan-redhat

/label docs-approved

@openshift-ci openshift-ci bot added the docs-approved Signifies that Docs has signed off on this PR label Nov 28, 2023
@candita
Contributor

candita commented Dec 6, 2023

/assign @gcs278
/assign @Miciah

Contributor

@gcs278 gcs278 left a comment

Thanks for the PR. I just took a quick review. I plan to continue to review more later.

pkg/operator/controller/cabundle-configmap/controller.go Outdated Show resolved Hide resolved
pkg/operator/operator.go Outdated Show resolved Hide resolved
pkg/operator/controller/cabundle-configmap/controller.go Outdated Show resolved Hide resolved
Contributor

openshift-ci bot commented Dec 7, 2023

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from gcs278. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Contributor

@gcs278 gcs278 left a comment

Nice unit and E2E tests, thanks for doing that. I think it's looking great; I just have some nitpicks and questions.

if curTime.Sub(cert.NotBefore).Hours() < 0 {
log.Info("certificate not yet valid, but will be considered", "certificate", certID, "valid after", cert.NotBefore.String(), "certificate bundle", caCertBundleName)
}
if curTime.Sub(cert.NotAfter).Hours() > 0 {
Contributor

I believe an expired CA is useless to HAProxy and will cause SSL errors, but have you thought about the scenario in which the CA expires while the router is running (the most likely scenario)? In that case, the next time a cluster-admin does a rollout of the router, the CA will be removed.

I don't think this is problematic, but I wonder if it may cause "churn" or confusion for a cluster admin, because the SSL errors caused by an expired CA may differ from those seen with no CA at all.

Contributor Author

The router returns a generic error, but yes, if HAProxy debug logs are enabled, a user can find the exact SSL error, and that could cause confusion. Could we cover this in the docs? Or should we just log the error and keep the CA certificate?

Contributor

I feel it might be safer to err on the side of keeping the cert, logging the error, and preserving HAProxy's failure mode for an expired CA certificate. As long as the expired CA cert doesn't crash HAProxy (I'm pretty confident it doesn't), I'm not sure what the net advantage of skipping or removing it is. HAProxy should handle expired CAs in a reasonable way.

Contributor

@Miciah Miciah Jan 12, 2024

If HAProxy does not error out with the expired CA certificate, we should keep it. The cluster-admin expressed intent to use the CA certificate, so we should use it if possible (that is, if using it doesn't break ingress by causing HAProxy to refuse to start). However, it wouldn't be a bad idea to warn the cluster-admin about the expired CA certificate in an alert.

Contributor Author

Suggestion incorporated. Only certificates with an insecure signature algorithm and non-CA certificates are skipped now.

test/e2e/user_configured_cabundle_test.go Outdated Show resolved Hide resolved
test/e2e/user_configured_cabundle_test.go Outdated Show resolved Hide resolved
test/e2e/user_configured_cabundle_test.go Show resolved Hide resolved
func buildServerDeployment(name types.NamespacedName, labels map[string]string, port int32, secretName, caConfigMapName string) *appsv1.Deployment {
const (
containerName = "hello-world-server"
imageName = "quay.io/bharath-b-rh/hello-world:latest"
Contributor

I think we need to try to use a standard image, and not one in a personal repo that could get deleted, or changed.

Some examples of images our existing tests use:

  • image-registry.openshift-image-registry.svc:5000/openshift/tools:latest
  • The router deployment's: deployment.Spec.Template.Spec.Containers[0].Image
  • quay.io/centos7/httpd-24-centos7 (I'm a little hesitant about this one, but at least it's in a public repo)

Contributor Author

I can work on making the source code public and add more description to the image in the repo.

For reencrypt, we need to verify service certificates, so we need an HTTPS server, and I was looking for alternatives. One I came across was Apache httpd, but it requires enabling certain SSL configuration, which would mean building our own image with the changes or handling it through an entrypoint script.
Can we use https://hub.docker.com/r/mendhak/http-https-echo/tags?
Please advise.

Contributor

I agree we should use image-registry.openshift-image-registry.svc:5000/openshift/tools:latest or quay.io/centos7/httpd-24-centos7 if we can. Alternatively, if the server code isn't too complex, we can build it into the ingress-operator executable, the way we did the test servers in https://github.com/openshift/cluster-ingress-operator/tree/master/test.

Contributor

In any case, we cannot use an image from Docker Hub as we could be affected by rate limiting or other issues if we did.

Contributor

Could we use quay.io/centos7/httpd-24-centos7 for this and add the httpd configuration we need via a configmap?

Contributor Author

Removed the public image. The test now uses the ingress-operator image, fetched from the deployment resource, to run the test HTTP server it provides.


const (
retries = 3
retryInterval = time.Minute
Contributor

@gcs278 gcs278 Jan 3, 2024

This might just be my speculative opinion, but a minute is quite a long time to sleep between requests. Maybe 10 or 20 seconds at most?

I also think it doesn't hurt to do more retries, maybe 5 or 10. Since this goes through the internet, we have to deal with flakes, especially flaky DNS lookups within our CI cluster. That might seem overly generous, but flakes are quite time and resource consuming.

Contributor Author

I initially used a one-minute retry interval to allow for the time configmap changes take to be reflected in the pod.
I have updated the retries and interval.

)

var (
insecureCertificateSignatureAlgorithms = map[x509.SignatureAlgorithm]string{
Contributor

Can you add a godoc comment to explain why these algorithms are considered insecure, and why we need this map? Something like the following (but double-check what I wrote for factual accuracy!):

Suggested change
insecureCertificateSignatureAlgorithms = map[x509.SignatureAlgorithm]string{
// insecureCertificateSignatureAlgorithms is used to warn about and
// filter out certificates that use algorithms that are no longer
// supported by OpenSSL. Configuring the router with a certificate that
// used one of these algorithms would cause HAProxy to refuse to start.
insecureCertificateSignatureAlgorithms = map[x509.SignatureAlgorithm]string{

Contributor Author

suggestion incorporated.

Comment on lines 137 to 154
// currentAdminCABundleConfigMap returns the current admin CA bundle
// configmap, or nil if it does not exist, and an error value when an error
// is encountered.
func (r *reconciler) currentAdminCABundleConfigMap(ctx context.Context) (*corev1.ConfigMap, error) {
	cm, err := r.fetchCABundleConfigMap(ctx, r.config.AdminCAConfigMapName)
	if err != nil {
		if errors.IsNotFound(err) {
			return nil, nil
		}
		return nil, err
	}
	return cm, nil
}

// currentServiceCABundleConfigMap returns the current service CA bundle
// configmap, and an error value when an error is encountered.
func (r *reconciler) currentServiceCABundleConfigMap(ctx context.Context) (*corev1.ConfigMap, error) {
	cm, err := r.fetchCABundleConfigMap(ctx, r.config.ServiceCAConfigMapName)
	if err != nil {
		return nil, err
	}
	return cm, nil
}

// currentIngressCABundleConfigMap returns the current configmap. Returns a
// Boolean indicating whether the configmap exists, the configmap if it does,
// and an error value when an error is encountered.
func (r *reconciler) currentIngressCABundleConfigMap(ctx context.Context) (bool, *corev1.ConfigMap, error) {
	cm, err := r.fetchCABundleConfigMap(ctx, r.config.IngressCAConfigMapName)
	if err != nil {
		if errors.IsNotFound(err) {
			return false, nil, nil
		}
		return false, nil, err
	}
	return true, cm, nil
}

// fetchCABundleConfigMap fetches the named configmap and returns it along
// with any error encountered.
func (r *reconciler) fetchCABundleConfigMap(ctx context.Context, name types.NamespacedName) (*corev1.ConfigMap, error) {
	cm := &corev1.ConfigMap{}
	err := r.client.Get(ctx, name, cm)
	return cm, err
}
Contributor

Consider defining a single currentConfigMap function:

// currentConfigMap returns the current named configmap.  Returns a Boolean
// indicating whether the configmap existed, the configmap if it did exist,
// and an error value.
func (r *reconciler) currentConfigMap(ctx context.Context, name types.NamespacedName) (bool, *corev1.ConfigMap, error) {
	var cm corev1.ConfigMap
	if err := r.client.Get(ctx, name, &cm); err != nil {
		if errors.IsNotFound(err) {
			return false, nil, nil
		}
		return false, nil, err
	}
	return true, &cm, nil
}

Re-using a function avoids code duplication. Having the explicit Boolean return value makes the logic more consistent and forces you to do explicit if !haveFooConfigmap checks, which is a convention we try to follow in this operator.

Contributor Author

suggestion incorporated.

// <Certificate CommonName> which is used in logging and for better identification of
// certificate in the bundle.
func getCertificatePrintId(cert *x509.Certificate) string {
return fmt.Sprintf("%s(SerialNumber)-%s(CommonName)", encodeSerialNumber(*cert.SerialNumber), cert.Subject.CommonName)
Contributor

Is SerialNumber guaranteed to be non-nil? I don't see that documented as part of the API contract: https://pkg.go.dev/crypto/x509#Certificate

Contributor Author

SerialNumber must be a positive integer in conforming certificates according to the RFC. This isn't documented in the API doc, but the ParseCertificate API reads the serial number from the certificate and, when it is not present, leaves the field at the type's zero value. Please let me know your thoughts: should I remove the logic around the serial number?

Comment on lines 282 to 284
if cert.Subject.CommonName == c.commonName &&
bytes.Compare(cert.SubjectKeyId, c.subjectKeyId) == 0 &&
cert.SerialNumber.Cmp(&c.serialNumber) == 0 {
Contributor

Is it fine to have two CA certificates with the same CN if they have different subject key ids or different serial numbers?

Contributor Author

Yes, CommonName is just an identifier provided by the user.

Comment on lines 113 to 116
if adminCABundle != nil && adminCABundle.Data != nil {
data, exist := adminCABundle.Data[adminCABundleConfigMapKeyName]
if !exist {
return false, nil, fmt.Errorf("%s is invalid, must contain \"%s\" key with required CA certificates", adminCABundle.Name, adminCABundleConfigMapKeyName)
Contributor

Why is it an error if adminCABundle.Data[adminCABundleConfigMapKeyName] is missing but not if adminCABundle.Data is nil?

Would it be safer to log that adminCABundle is invalid but continue on to create the ingress CA bundle configmap with just the service CA?

Contributor Author

Why is it an error if adminCABundle.Data[adminCABundleConfigMapKeyName] is missing but not if adminCABundle.Data is nil?

A user could configure an empty configmap, but if it's not empty and the key name is misspelt, I thought that should be logged.

Would it be safer to log that adminCABundle is invalid but continue on to create the ingress CA bundle configmap with just the service CA?

Yes, I have updated the code to create events when adminCABundle has errors.
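
The behavior agreed on here can be sketched in Go as below: an empty configmap is silently accepted, while a non-empty one without the expected key produces a warning and the caller carries on with only the service CA. The adminCABundleData function and the warning text are hypothetical; the "ca-bundle.crt" key comes from the QE test command earlier in this conversation. How the PR turns the warning into an event is not shown.

```go
package main

import "fmt"

const adminCABundleKey = "ca-bundle.crt"

// adminCABundleData (hypothetical) returns the admin bundle contents and a
// warning. Empty data is valid and yields no warning; a missing key yields
// a warning so the caller can record it and fall back to the service CA.
func adminCABundleData(name string, data map[string]string) (string, string) {
	if len(data) == 0 {
		return "", ""
	}
	bundle, ok := data[adminCABundleKey]
	if !ok {
		return "", fmt.Sprintf("%s is invalid, must contain %q key with required CA certificates", name, adminCABundleKey)
	}
	return bundle, ""
}

func main() {
	cases := []map[string]string{
		nil,
		{"ca-bundel.crt": "misspelt key"},
		{"ca-bundle.crt": "admin CA PEM"},
	}
	for _, data := range cases {
		bundle, warning := adminCABundleData("admin-ca-bundle", data)
		fmt.Printf("bundle=%q warning=%q\n", bundle, warning)
	}
}
```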

@@ -483,7 +483,7 @@ func TestDesiredRouterDeploymentSpecTemplate(t *testing.T) {
if volume.Secret.SecretName != secretName {
t.Errorf("router Deployment expected volume %s to have secret %s, got %s", volume.Name, secretName, volume.Secret.SecretName)
}
-} else if volume.Name != "service-ca-bundle" {
+} else if volume.Name != controller.IngressCAConfigMapName().Name {
Contributor

Any reason for the changes in this file? For test code, I think it is appropriate to use explicit string literals.

Contributor Author

This implementation changes the volume name to ingress-ca-bundle. While updating the test, I thought I would use the definition rather than hard-code the name. Please let me know if it needs to be reverted.

test/e2e/user_configured_cabundle_test.go Outdated Show resolved Hide resolved
test/e2e/user_configured_cabundle_test.go Outdated Show resolved Hide resolved
Comment on lines 210 to 236
if err = wait.PollUntilContextTimeout(ctx, 2*time.Second, 3*time.Minute, true, func(ctx context.Context) (bool, error) {
	if err := kclient.Get(ctx, svcDeploymentName1, svcDeployment1); err != nil {
		t.Logf("failed to get server deployment %q: %v", svcDeploymentName1, err)
		return false, nil
	}
	for _, cond := range svcDeployment1.Status.Conditions {
		if cond.Type == appsv1.DeploymentAvailable {
			return cond.Status == corev1.ConditionTrue, nil
		}
	}
	return false, nil
}); err != nil {
	t.Fatalf("timed out waiting for deployment %q to become ready: %v", svcDeploymentName1, err)
}
Contributor

This should probably be a helper function, like waitForDeploymentComplete.

Contributor Author

suggestion incorporated.

@bharath-b-rh bharath-b-rh force-pushed the cfe-984 branch 2 times, most recently from 13a8229 to b7046f6 Compare January 23, 2024 11:32
@bharath-b-rh
Contributor Author

/hold openshift/router#537
The e2e cases added here depend on the changes in openshift/router#537 and require it to be merged first.

@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 24, 2024
@bharath-b-rh
Contributor Author

/remove-lifecycle stale

@openshift-ci openshift-ci bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 24, 2024
@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 24, 2024
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 24, 2024
@openshift-merge-robot
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@bharath-b-rh bharath-b-rh marked this pull request as draft August 7, 2024 13:47
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 7, 2024
@openshift-bot
Contributor

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Sep 7, 2024
@lihongan
Contributor

lihongan commented Sep 9, 2024

/remove-lifecycle rotten

@openshift-ci openshift-ci bot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Sep 9, 2024
@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 8, 2024
@openshift-bot
Contributor

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 8, 2025
Contributor

openshift-ci bot commented Jan 9, 2025

@bharath-b-rh: The following tests failed. Say /retest to rerun all failed tests, or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-hypershift | aa18a46 | link | true | /test e2e-hypershift |
| ci/prow/e2e-azure-operator | aa18a46 | link | true | /test e2e-azure-operator |
| ci/prow/e2e-gcp-ovn | aa18a46 | link | false | /test e2e-gcp-ovn |
| ci/prow/e2e-aws-ovn-single-node | aa18a46 | link | false | /test e2e-aws-ovn-single-node |
| ci/prow/e2e-gcp-operator | aa18a46 | link | true | /test e2e-gcp-operator |
| ci/prow/hypershift-e2e-aks | aa18a46 | link | true | /test hypershift-e2e-aks |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@gcs278
Contributor

gcs278 commented Jan 22, 2025

I won't be able to review this due to other commitments.
/unassign

Labels
do-not-merge/hold: Indicates that a PR should not merge because someone has issued a /hold command.
do-not-merge/work-in-progress: Indicates that a PR should not merge because it is a work in progress.
docs-approved: Signifies that Docs has signed off on this PR
jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.
needs-rebase: Indicates a PR cannot be merged because it has merge conflicts with HEAD.
qe-approved: Signifies that QE has signed off on this PR

10 participants