refactor: remove worker pool #628
Force-pushed: 82e1390 to ffc3d86
populate_span_with_admission_request_data(&admission_review.request);

let response = acquire_semaphore_and_evaluate(
    state,
    policy_id,
    ValidateRequest::AdmissionRequest(admission_review.request),
    RequestOrigin::Validate,
)
.await
.map_err(handle_evaluation_error)?;

populate_span_with_policy_evaluation_results(&response);
As far as I can see, the tracing for policies in monitor mode will always report the result as "allowed", which is not desired. The metrics should show the original evaluation result. Otherwise, users lose the value of monitor mode, which is to see the original results before moving the policy to protect mode.
I agree with this. Still, this isn't a regression from this PR, so I'm OK with opening an issue and tackling it later. I'm wondering if it is better to:
a. Add a `mode` field and set the `allowed` value to the policy output, which will be `false` if the policy is in monitor mode and the request is rejected.
b. Add `mode` and `monitor_result` fields. If `mode == monitor`, then `allowed` is always `true` and `monitor_result` equals the result from the policy.
I think we should create a new issue for that, and tackle it outside of this PR.
I also prefer to keep the current behavior, but extend the trace to have:
- a new field `mode` that states whether the policy is operating in `monitor` or `protect` mode
- a new field `raw_result` (or something else) that contains the boolean value of the evaluation result before monitor mode changes it. We could add this field to all policies, regardless of their operating mode, or only to the traces emitted by policies operating in `monitor` mode
At the end of the day, I want an operator to be able to run a Jaeger query like: select * from traces where operation_mode = "monitor" and raw_result = false
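A minimal sketch of how this proposal could look, assuming the span carries three values: the operating mode, the `allowed` value returned to the caller, and the untouched evaluation result. `PolicyMode`, `TraceFields`, and `trace_fields` are hypothetical names used only for illustration; they are not taken from the policy-server code base.

```rust
/// Operating mode of a policy (illustrative type, not from policy-server).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PolicyMode {
    Monitor,
    Protect,
}

impl PolicyMode {
    /// String form that would be written to the trace.
    pub fn as_str(self) -> &'static str {
        match self {
            PolicyMode::Monitor => "monitor",
            PolicyMode::Protect => "protect",
        }
    }
}

/// Hypothetical set of values attached to a trace span.
#[derive(Debug, PartialEq, Eq)]
pub struct TraceFields {
    pub mode: &'static str,
    /// What the API server sees: always true in monitor mode.
    pub allowed: bool,
    /// What the policy actually decided, before monitor mode overrides it.
    pub raw_result: bool,
}

/// Builds the trace values from the mode and the raw evaluation result.
pub fn trace_fields(mode: PolicyMode, raw_result: bool) -> TraceFields {
    let allowed = match mode {
        // Monitor mode never rejects, it only records the raw result.
        PolicyMode::Monitor => true,
        PolicyMode::Protect => raw_result,
    };
    TraceFields {
        mode: mode.as_str(),
        allowed,
        raw_result,
    }
}
```

With this shape, a rejected request under a monitor-mode policy is still reported as allowed, while `raw_result` stays `false`, which is the combination the query above looks for.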
LGTM! Just some quibbles, looks good.
This is great, many thanks! Great refactor, expanded tests & better logging. Good to make use of maturity of Tokio frameworks (plus I learned about idiomatic Axum usage).
Played with it locally too.
Since there are known users of bare policy-server, we should document in the GH release changelog that we now expect HTTP requests to have the `Content-Type: application/json` header set, and that we return better errors.
Force-pushed: e191fc7 to 736b28b
Signed-off-by: Fabrizio Sestito <fabrizio.sestito@suse.com>
…ecific module Signed-off-by: Fabrizio Sestito <fabrizio.sestito@suse.com>
…ests Signed-off-by: Fabrizio Sestito <fabrizio.sestito@suse.com>
Force-pushed: 736b28b to f4b4d5a
Fantastic job @fabriziosestito 👏 I left some comments, I think we're pretty close to merging this PR
Merging, all the concerns/issues have been addressed! 🥳
Description
Fixes: #611
This PR refactors the policy server by removing the worker thread pool.
It also changes the web framework from `warp` to `axum`.
Removing the worker pool is possible because we can now share the `EvaluationEnvironment` between handlers using the axum `State` extractor.
A semaphore is used to limit the number of simultaneous evaluations instead of relying on a pool of workers.
This removes the complexity of bootstrapping the worker pool and of bridging the sync and async worlds. It also simplifies the evaluation flow, since handlers no longer have to rely on channels and async communication to start an evaluation.
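To illustrate the idea of bounding concurrent evaluations with permits rather than with a fixed set of workers, here is a minimal sketch. It is deliberately std-only: it uses a small blocking semaphore built on `Mutex` and `Condvar` and plain threads as a stand-in for the async semaphore used inside the handlers, so it is not the code from this PR. `Semaphore`, `Permit`, and `peak_concurrency` are hypothetical names, and the evaluation itself is a placeholder.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Minimal counting semaphore (std-only stand-in for an async semaphore).
pub struct Semaphore {
    permits: Mutex<usize>,
    available: Condvar,
}

impl Semaphore {
    pub fn new(permits: usize) -> Self {
        Self {
            permits: Mutex::new(permits),
            available: Condvar::new(),
        }
    }

    /// Blocks until a permit is free, then takes it.
    pub fn acquire(&self) -> Permit<'_> {
        let mut permits = self.permits.lock().unwrap();
        while *permits == 0 {
            permits = self.available.wait(permits).unwrap();
        }
        *permits -= 1;
        Permit { semaphore: self }
    }
}

/// RAII guard: the permit is returned when the guard is dropped.
pub struct Permit<'a> {
    semaphore: &'a Semaphore,
}

impl Drop for Permit<'_> {
    fn drop(&mut self) {
        *self.semaphore.permits.lock().unwrap() += 1;
        self.semaphore.available.notify_one();
    }
}

/// Runs `requests` placeholder evaluations, each on its own thread, with at
/// most `limit` of them inside the guarded section at once. Returns the
/// highest number of evaluations observed running at the same time.
pub fn peak_concurrency(requests: usize, limit: usize) -> usize {
    let semaphore = Arc::new(Semaphore::new(limit));
    let running = Arc::new(AtomicUsize::new(0));
    let peak = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..requests)
        .map(|_| {
            let semaphore = Arc::clone(&semaphore);
            let running = Arc::clone(&running);
            let peak = Arc::clone(&peak);
            thread::spawn(move || {
                // Each "handler" takes a permit before evaluating.
                let _permit = semaphore.acquire();
                let now = running.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);
                thread::yield_now(); // placeholder for the policy evaluation
                running.fetch_sub(1, Ordering::SeqCst);
                // `_permit` is dropped here, releasing the slot.
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    peak.load(Ordering::SeqCst)
}
```

Because every caller takes a permit right before evaluating and releases it right after, there is no dispatcher, channel, or long-lived worker to manage; the limit is enforced at the call site.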
This PR also takes care of the following:
Test
Existing tests were moved/updated to comply with the new code.
As expected, load testing doesn't show major performance improvements; the old and new implementations show roughly the same results.
Manual metrics/tracing testing was performed.
TODO:
Additional Information
Since the sigstore crate depends on the blocking `reqwest` feature through an old version of `tough`, I needed to wrap the fulcio and rekor initialization in a `spawn_blocking` task.
This should go away once we update sigstore-rs, see:
sigstore/sigstore-rs#320
and
policy-server/src/lib.rs
Line 58 in 624ca3b