Add enhancement proposal for result forwarding #69
Conversation
Force-pushed from 6b7b55d to 67e4594
/retest
enhancements/results-forwarding.md (outdated):

Required: True
Type: string

Reference to a server certificate/keypair secret for mutual TLS.
Confused as to why we need the server cert - I thought the aggregator was a client only?
Or is this secret meant for the gRPC service?
I thought this was for the gRPC server, but I'll let @mrogers950 and @JAORMX clarify.
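For reference, a minimal sketch of what wiring such a secret into server-side mutual TLS could look like in Go, assuming the secret is mounted as `tls.crt`/`tls.key` files plus a CA bundle; the paths and port are illustrative, not part of the proposal:

```go
// Hypothetical sketch: building mutual-TLS credentials for the gRPC
// server from a mounted certificate/keypair secret.
package main

import (
	"crypto/tls"
	"crypto/x509"
	"log"
	"net"
	"os"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials"
)

func main() {
	// Server keypair, e.g. from the secret discussed above.
	cert, err := tls.LoadX509KeyPair("/tls/tls.crt", "/tls/tls.key")
	if err != nil {
		log.Fatal(err)
	}

	// CA bundle used to verify client certificates; requiring and
	// verifying client certs is what makes the TLS mutual.
	caPEM, err := os.ReadFile("/tls/ca.crt")
	if err != nil {
		log.Fatal(err)
	}
	caPool := x509.NewCertPool()
	caPool.AppendCertsFromPEM(caPEM)

	creds := credentials.NewTLS(&tls.Config{
		Certificates: []tls.Certificate{cert},
		ClientAuth:   tls.RequireAndVerifyClientCert,
		ClientCAs:    caPool,
	})

	lis, err := net.Listen("tcp", ":9443")
	if err != nil {
		log.Fatal(err)
	}
	srv := grpc.NewServer(grpc.Creds(creds))
	// Register the result-forwarding service implementation here.
	log.Fatal(srv.Serve(lis))
}
```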
#### `resultForwarding.provider`

Required: True
Possibly a dumb question, but what does 'Required' mean here? "Does not have a default and must be set"?
Shouldn't `grpc` to the local CRD-creating service be the default?
I was even wondering if we should have a `grpc-local` provider as the default that would nominally talk gRPC, but over a UNIX socket in some shared volume rather than TCP. This would let us avoid requiring mTLS for the communication between the components, roughly as sketched below.
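To illustrate the idea (not part of the proposal), a minimal Go sketch of the client side of that hypothetical `grpc-local` provider; the socket path is illustrative:

```go
// Sketch of a grpc-local provider: gRPC over a UNIX socket in a
// shared volume, so no mTLS is needed between co-located components.
package main

import (
	"context"
	"log"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

func main() {
	// Plaintext is acceptable here because the socket never leaves
	// the pod's shared volume.
	conn, err := grpc.DialContext(context.Background(),
		"unix:///var/run/compliance/aggregator.sock",
		grpc.WithTransportCredentials(insecure.NewCredentials()),
	)
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
	// Use conn with the generated result-forwarding client stubs.
}
```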
> Possibly a dumb question, but what does 'Required' mean here? "Does not have a default and must be set"? Shouldn't `grpc` to the local CRD-creating service be the default?
Yeah - I think required means it must be specified by the user.
Maybe we need to go through the various options and pick the best default for backwards compatibility. A `grpc` provider makes a lot of sense for the external storage case, but it requires the end user to know where they want to send results (something they don't necessarily care about today, since results go into etcd), right?
So of course the downside of exposing the grpc service over local sockets only is that we wouldn't test the network forwarding and would have more configurations to test.
With the grpc-over-network approach, how would the default CO deployment look then? I was imagining we'd have one more deployment with the grpc server and a service, and then the grpc client would use http://service as the endpoint (and since we'd know the service name, everything could just use defaults).
@mrogers950 do you have an opinion?
If it's true that we only need to forward the objects that the aggregator is responsible for (results and remediations, see my comment above), my thought is that our "local" mode would just be the existing CR creation, always on by default, and if resultForwarding.provider is set, there is also an attempt to forward. So resultForwarding.Provider would be optional, and we'd provide another option, resultForwarding.DisableLocal, that is false by default and only takes effect when it is set to true and a provider is also set - roughly as in the sketch below.
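A small sketch of that defaulting logic; field names follow the comment above and are illustrative, not final API:

```go
// Illustrative only; the real fields would live on the operator's CRD.
package forwarding

type ResultForwarding struct {
	Provider     string // optional; empty means "do not forward"
	DisableLocal bool   // only honored when Provider is also set
}

func (rf ResultForwarding) shouldForward() bool {
	return rf.Provider != ""
}

func (rf ResultForwarding) shouldCreateLocalCRs() bool {
	// DisableLocal only takes effect when a provider is configured,
	// so a misconfiguration can never leave results stored nowhere.
	return !(rf.DisableLocal && rf.Provider != "")
}
```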
On the topic of the local gRPC traffic: since this is only results, I think that making a `local` provider and all would be overkill (it requires the aggregator to break up into a client/server). For the evidence proposal, that type of pattern makes more sense for what we want to accomplish there, since that's where our existing component clients and servers are involved.
Hmm, I guess your proposal would also be safer (more stable) in the sense that the aggregator wouldn't change that drastically from a self-contained binary to a client/server.
Just FWIW, the reason I proposed the always-forwarder was that it felt architecturally much cleaner (even if local) for the aggregator to be just a forwarder to a handler, because we'll have to implement the forwarding either way.
I'm not opposed to moving towards that if it makes sense for the rest of the forwarding features. We could still build the server implementation but keep it as a test component to run the aggregator client against.
That sounds like a good compromise. We need a testbed anyway, and having a test server that produces the CRDs would also make it easier to do e2e testing.
Good idea!
Force-pushed from 67e4594 to d586a0f
enhancements/results-forwarding.md (outdated):

to a configured implementation. In this case, the **aggregator** will act as a
gRPC client. The **resultserver** then becomes a gRPC server, subject to the
evidence forwarding provider and will be renamed **evidence-persistor** and
responsible for writing results.
In this case - what does the `aggregator` currently use to pass results to the `resultserver`?
Or are we just saying that the `aggregator` will be a gRPC client, and an example server implementation would be `resultserver` or rhmdnd/compserv?
The former.
Currently, the aggregator parses all the results coming in as ConfigMaps from the nodes.
It "aggregates" them together, forming groups from the different machine pools, e.g. one result is applicable to a set of nodes, not just one node. This is also done to detect inconsistencies in pools: "one node deviated from the benchmark configuration; it should be fixed."
Finally, it takes that aggregated result and creates the CRDs based on it.
The idea is to change that last step of the model and turn the aggregator into a gRPC client that always forwards. This way, the code becomes more general and we can then focus on writing server implementations: be it the default one that creates CRDs, or compserv, which centralizes the results.
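A rough sketch of that model, with illustrative names only: the aggregation pipeline stays intact, and the final CRD-creation step goes through a handler interface the gRPC client talks to:

```go
// Illustrative sketch; real message types would come from the
// gRPC/protobuf definitions.
package forwarding

import "context"

// AggregatedResult stands in for the output of the existing
// aggregation step (grouped per machine pool, with inconsistency
// detection already applied).
type AggregatedResult struct {
	CheckID string
	Status  string
}

// ResultHandler is what the always-forwarding aggregator client talks
// to. The default server implementation would create the existing
// CRDs; another could centralize results in a service like compserv.
type ResultHandler interface {
	HandleResult(ctx context.Context, r AggregatedResult) error
}
```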
Thanks for the additional details. I attempted to clarify that in the wording.
Force-pushed from d586a0f to 0e08eac
enhancements/results-forwarding.md (outdated):

#### `resultForwarding.provider`

Required: True
I'd say always requiring the gRPC traffic to go through the network would help keep things consistent; e.g., you wouldn't need a code path to handle creating a UNIX socket and a separate goroutine in the aggregator to handle it.
I also agree that we should choose a sane, backwards-compatible default; that's what users expect, and it would help with migration.
It sounds like this should still be required, then?
Should we also have a field for marking a default value?
Force-pushed from 0e08eac to 6f6c110
Force-pushed from 6f6c110 to b0822ea
This commit writes down the proposal for implementing a results forwarding mechanism in the compliance operator.

Co-Authored-By: Juan Antonio Osorio juan.osoriorobles@eu.equinix.com
Force-pushed from b0822ea to 06ab0c6
On Thu, Sep 01, 2022 at 12:11:47PM -0700, Lance Bragstad wrote:
@rhmdnd commented on this pull request.
> +Required: True if using `token` authentication type
+Type: string
+
+#### `resultForwarding.grpc.authentication.token.tokenSecretName`
+
+Required: False
+Type: string
+
+Reference to the secret containing the token.
+
+#### `resultForwarding.grpc.extraMetadata.clusterName`
+
+Required: True
+Type: string
+
+The name of the cluster to use in result payloads.
Ok - I'll pull this out of the per-scan configuration then and document it elsewhere.
Who would be responsible for setting up this config map?
The way I understood the workflow was:
- the admin could set up the config map themselves manually
- if we can infer the cluster name, CO should set this config map automatically. This would be the default on OCP.
- (unsure about this part) if the CM is not set and autodetection is not possible, what then? A random string (that would have to be stable across re-runs)? Or just a hard fail?
I like the idea of having a random string. It's just ideal to have a way to detect a unique deployment.
Ok, let me summarize to make sure I understand where we are: the Compliance Operator will check for a config map holding the cluster name, and will populate it automatically if it can infer the name. Administrators can modify this value if they choose to (e.g., if autodetection is not possible). We could recommend that users populate this config map through documentation. Is that correct?
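If it helps, a hedged sketch of that resolution order; every helper here is hypothetical, and the generated identifier is written back to the config map so it stays stable across re-runs:

```go
// Hypothetical sketch only; none of these helpers exist in the
// operator today.
package forwarding

import (
	"context"

	"github.com/google/uuid"
)

func resolveClusterName(ctx context.Context) (string, error) {
	// 1. An admin-provided config map always wins.
	if name, ok := clusterNameFromConfigMap(ctx); ok {
		return name, nil
	}
	// 2. Otherwise try to infer the name from the platform (the
	//    default on OCP).
	if name, ok := inferClusterName(ctx); ok {
		// Write it back so later runs see the same value.
		return name, writeClusterNameConfigMap(ctx, name)
	}
	// 3. Fall back to a generated identifier, persisted to the
	//    config map so it stays stable across re-runs.
	name := uuid.NewString()
	return name, writeClusterNameConfigMap(ctx, name)
}
```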
/lgtm
imo we should just continue iterating on the doc as we implement the API
Looks great - one small educational question. I like this proposal, seems like it would be very helpful.
Requires a network connection to forward results after each scan. The
Compliance Operator should validate the endpoint URL and fail early if it is
malformed. If the Compliance Operator cannot connect to the gRPC endpoint, it
should retry and issue an alert.
This seems to answer the question posed on line 436, right?
Yeah - I think we could supply a configuration, through documentation, that tells users how to configure the compliance-operator to forward results and how to store them locally.
The safety net in that case would be that the results are saved somewhere in a persistent volume if the gRPC forwarding fails.
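A rough sketch of that retry-then-fall-back behavior, with hypothetical helpers (`forwardOnce`, `raiseAlert`, `persistLocally`) standing in for the real implementation:

```go
// Sketch only; validate the endpoint early, retry with backoff, then
// alert and rely on the persistent volume as the safety net.
package forwarding

import (
	"context"
	"fmt"
	"net/url"
	"time"
)

func forwardWithRetry(ctx context.Context, endpoint string, payload []byte) error {
	// Fail early on a malformed endpoint URL.
	if _, err := url.ParseRequestURI(endpoint); err != nil {
		return fmt.Errorf("invalid gRPC endpoint %q: %w", endpoint, err)
	}
	backoff := time.Second
	for attempt := 0; attempt < 5; attempt++ {
		if err := forwardOnce(ctx, endpoint, payload); err == nil {
			return nil
		}
		time.Sleep(backoff)
		backoff *= 2 // simple exponential backoff
	}
	// Retries exhausted: alert, and keep the unsent results on the
	// persistent volume.
	raiseAlert("result forwarding failed for " + endpoint)
	return persistLocally(payload)
}
```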
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: jhrozek, rhmdnd, sheriff-rh.
I'm on board with updating the document as we implement the API and configuration. @JAORMX @mrogers950 @Vincent056 thoughts?
/label qe-approved