[Security Solution] Swap rule unions out for discriminated unions to improve validation error messages #171452

Merged
merged 14 commits into from
Nov 29, 2023

Conversation

marshallmain
Contributor

@marshallmain marshallmain commented Nov 16, 2023

Epics: https://github.com/elastic/security-team/issues/8058, https://github.com/elastic/security-team/issues/6726 (internal)
Partially addresses: https://github.com/elastic/security-team/issues/7991 (internal)

Summary

The main benefit of this PR is shown in rule_request_schema.test.ts, where the error messages are now more accurate and concise. With regular unions, zod has to try validating the input against every schema in the union and reports the errors from all of them. Switching to discriminated unions, with type as the discriminator, lets zod pick the matching rule type schema from the union and validate against only that schema. As a result, the error message falls into one of two cases: if type is not valid, it reports that the discriminator is invalid; if type is valid but another field is wrong for that rule type, it contains the validation result from that rule type only.
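
To make the difference concrete, here is a minimal, dependency-free sketch of the two strategies. It is not zod's implementation and does not reproduce zod's error format; the three rule schemas, their fields, and the helper names (`expectType`, `validateUnion`, `validateDiscriminatedUnion`) are hypothetical and exist only for illustration.

```typescript
// Dependency-free sketch: this mimics the two strategies, it is not zod's code.
type Input = Record<string, unknown>;
type Validator = (input: Input) => string[];

interface RuleSchema {
  type: string;
  validate: Validator;
}

// Returns one issue if `input[key]` does not have the expected `typeof` result.
const expectType = (input: Input, key: string, expected: string): string[] =>
  typeof input[key] === expected ? [] : [`${key}: expected ${expected}`];

// Hypothetical, heavily simplified rule schemas, each tagged with its `type` literal.
const ruleSchemas: RuleSchema[] = [
  { type: 'query', validate: (input) => expectType(input, 'query', 'string') },
  {
    type: 'eql',
    validate: (input) =>
      expectType(input, 'query', 'string').concat(expectType(input, 'language', 'string')),
  },
  {
    type: 'threshold',
    validate: (input) =>
      expectType(input, 'query', 'string').concat(expectType(input, 'threshold', 'number')),
  },
];

// Plain union: try every member; if none passes, report the issues from all of them.
function validateUnion(input: Input): string[] {
  const collected: string[] = [];
  for (const schema of ruleSchemas) {
    const typeIssues = input.type === schema.type ? [] : [`type: expected "${schema.type}"`];
    const issues = typeIssues.concat(schema.validate(input));
    if (issues.length === 0) return [];
    collected.push(...issues);
  }
  return collected;
}

// Discriminated union: inspect `type` first, then validate against one member only.
function validateDiscriminatedUnion(input: Input): string[] {
  const match = ruleSchemas.filter((schema) => schema.type === input.type)[0];
  if (match === undefined) {
    const allowed = ruleSchemas.map((schema) => schema.type).join(' | ');
    return [`type: invalid discriminator value, expected ${allowed}`];
  }
  return match.validate(input);
}

// A threshold rule that is missing its `threshold` field.
const incompleteRule: Input = { type: 'threshold', query: 'host.name: *' };
const unionIssues = validateUnion(incompleteRule);
const discriminatedIssues = validateDiscriminatedUnion(incompleteRule);
```

In this sketch `discriminatedIssues` holds only the missing `threshold` complaint, while `unionIssues` also carries the irrelevant complaints from the `query` and `eql` members.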

To make it possible to use discriminated unions, we need to switch from using zod's .and() for intersections to .merge() because .and() returns an intersection type that is incompatible with discriminated unions in zod. Similarly, we need to remove the usage of .transform() because it returns a ZodEffects that is incompatible with .merge().

Instead of using .transform() to turn properties from optional to possibly undefined, we can use requiredOptional explicitly in specific places to convert the types. Similarly, the RequiredOptional type can be used on the return type of conversion functions between API and internal schemas to enforce that all properties are explicitly specified in the conversion.
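
A minimal sketch of the second use, assuming the non-recursive RequiredOptional definition that appears in the review diff of this PR. The two interfaces and their field names are illustrative assumptions, not the real schema types; only the converter name convertAlertSuppressionToSnake is taken from the discussion.

```typescript
// Non-recursive RequiredOptional, as quoted in the review diff of this PR:
// every optional key `k?: V` becomes a required key `k: V | undefined`.
type RequiredOptional<T> = { [K in keyof T]-?: [T[K]] } extends infer U
  ? U extends Record<keyof U, [unknown]>
    ? { [K in keyof U]: U[K][0] }
    : never
  : never;

// Illustrative shapes only; the real alert suppression types may differ.
interface AlertSuppressionCamel {
  groupBy: string[];
  missingFieldsStrategy?: string;
}

interface AlertSuppressionSnake {
  group_by: string[];
  missing_fields_strategy?: string;
}

// With RequiredOptional on the return type, leaving out
// `missing_fields_strategy` below is a compile-time error, even though the
// field is optional in AlertSuppressionSnake.
const convertAlertSuppressionToSnake = (
  input: AlertSuppressionCamel
): RequiredOptional<AlertSuppressionSnake> => ({
  group_by: input.groupBy,
  missing_fields_strategy: input.missingFieldsStrategy,
});

const convertedSuppression = convertAlertSuppressionToSnake({ groupBy: ['host.name'] });
```

The key is always present in the converted object, possibly with the value undefined, so a field added to the schema later cannot be silently dropped by the converter.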

Future work:

  • Better alignment of codegen with the OpenAPI definitions of anyOf/oneOf (https://swagger.io/docs/specification/data-models/oneof-anyof-allof-not/#oneof). oneOf requires that the input match exactly one schema from the list, which is different from z.union. anyOf should map to z.union and oneOf to z.discriminatedUnion.
  • Flatten the schema structure further to avoid the "Type instantiation is excessively deep and possibly infinite" error. This seems to be a common issue with zod (see microsoft/TypeScript#34933, "“Type instantiation is excessively deep and possibly infinite” but only in a large codebase"). Limiting the number of .merge and other zod operations needed to build a particular schema object seems to help resolve the error. Combining ResponseRequiredFields and ResponseOptionalFields into a single object rather than merging them solved the immediate problem. However, we may still be near the depth limit. Changing RuleResponse as seen below also solved the problem in testing, and may give us more headroom for future changes if we apply techniques like this here and in other places. The difference is that SharedResponseProps is intersected with the type-specific schemas only after they are combined in a discriminated union, whereas in main we merge SharedResponseProps with each individual schema and then combine the results.
  • Combine other Required and Optional schemas, like QueryRuleRequiredFields and QueryRuleOptionalFields.
export type RuleResponse = z.infer<typeof RuleResponse>;
export const RuleResponse = SharedResponseProps.and(z.discriminatedUnion('type', [
  EqlRuleResponseFields,
  QueryRuleResponseFields,
  SavedQueryRuleResponseFields,
  ThresholdRuleResponseFields,
  ThreatMatchRuleResponseFields,
  MachineLearningRuleResponseFields,
  NewTermsRuleResponseFields,
  EsqlRuleResponseFields,
]));
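
At the plain TypeScript level, the two constructions describe the same set of values, which is why the reordering is safe for consumers of RuleResponse. The sketch below uses hypothetical, much smaller interfaces to show this; it says nothing about zod's internal type depth, only about the resulting shapes.

```typescript
// Hypothetical stand-ins for the shared and type-specific response fields.
interface SharedProps {
  id: string;
  name: string;
}

interface EqlFields {
  type: 'eql';
  query: string;
}

interface ThresholdFields {
  type: 'threshold';
  threshold: number;
}

// Style used in main: shared props are folded into every member first.
type MergedFirst = (SharedProps & EqlFields) | (SharedProps & ThresholdFields);

// Style proposed above: build the union first, intersect shared props once.
type IntersectedLast = SharedProps & (EqlFields | ThresholdFields);

// Narrowing on `type` still works, because TypeScript distributes the
// intersection over the union members.
const summarizeRule = (rule: IntersectedLast): string =>
  rule.type === 'eql'
    ? `${rule.name} (eql): ${rule.query}`
    : `${rule.name} (threshold): ${rule.threshold}`;

// A value typed with one style is accepted where the other is expected.
const thresholdRule: MergedFirst = {
  id: '1',
  name: 'Many logins',
  type: 'threshold',
  threshold: 10,
};
const ruleSummary = summarizeRule(thresholdRule);
```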

@marshallmain marshallmain marked this pull request as ready for review November 20, 2023 15:33
@marshallmain marshallmain requested review from a team as code owners November 20, 2023 15:33
@marshallmain marshallmain added release_note:skip Skip the PR/issue when compiling release notes v8.12.0 Team:Detection Rule Management Security Detection Rule Management Team labels Nov 20, 2023
Contributor

@rylnd rylnd left a comment

Nice writeup; much obliged. Detection engine changes are minimal; mostly just the moved requiredOptional calls and all the test updates (which look great). Nice discriminators, as well 😉 .

LGTM

@banderror banderror added Team:Detections and Resp Security Detection Response Team Team: SecuritySolution Security Solutions Team working on SIEM, Endpoint, Timeline, Resolver, etc. labels Nov 22, 2023
@elasticmachine
Contributor

Pinging @elastic/security-detections-response (Team:Detections and Resp)

@elasticmachine
Contributor

Pinging @elastic/security-solution (Team: SecuritySolution)

Contributor

@maximpn maximpn left a comment

@marshallmain thank you for improving error messages 🙏

Error messages look much better now 👍 And I'm really happy you got rid of x-modify: requiredOptional.

We definitely need to keep an eye on our schema complexity to avoid Type instantiation is excessively deep and possibly infinite. Zod is well known for complex typing and is one of the slowest libraries for type inference because of its flexibility. It would be interesting to know whether TS can also be tuned to allow "deeper" types.

The only concern I have is related to params.meta as { [x: string]: {} | undefined }. Basically, the recursive RequiredOptional changes the type from Record<string, unknown> to Record<string, {} | undefined>. One solution could be to define the SnakeCase types explicitly and put the modified types there. Another is to refrain from making RequiredOptional recursive.

(Btw, there is a utility type to transform camel case to snake case.)
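
For readers unfamiliar with this kind of utility, here is one possible shape of it, written with template literal types (TypeScript 4.1 or later). The names CamelToSnake, SnakeCaseKeys, camelToSnake, and toSnakeKeys are hypothetical; this is not the utility type referred to above, just a sketch of the technique.

```typescript
// Type-level conversion: walk the string one character at a time and insert
// an underscore whenever the rest of the string starts with an uppercase letter.
type CamelToSnake<S extends string> = S extends `${infer Head}${infer Tail}`
  ? Tail extends Uncapitalize<Tail>
    ? `${Lowercase<Head>}${CamelToSnake<Tail>}`
    : `${Lowercase<Head>}_${CamelToSnake<Tail>}`
  : S;

// Remap every string key of T to its snake_case form, keeping the value types.
type SnakeCaseKeys<T> = {
  [K in keyof T as K extends string ? CamelToSnake<K> : K]: T[K];
};

// Runtime counterpart of CamelToSnake for simple camelCase identifiers.
const camelToSnake = (key: string): string =>
  key.replace(/[A-Z]/g, (letter) => `_${letter.toLowerCase()}`);

// Shallow key conversion; nested objects are left untouched.
const toSnakeKeys = <T extends Record<string, unknown>>(obj: T): SnakeCaseKeys<T> => {
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(obj)) {
    out[camelToSnake(key)] = obj[key];
  }
  // The cast is needed because the compiler cannot relate the runtime loop
  // to the mapped type.
  return out as unknown as SnakeCaseKeys<T>;
};

const snakeParams = toSnakeKeys({ outputIndex: '.alerts', timelineId: 'abc' });
// `snakeParams.timeline_id` and `snakeParams.output_index` are typed as string.
```

Unlike RequiredOptional on a hand-written converter, a generic helper like this does not force each field to be listed explicitly, so it trades some of the exhaustiveness checking for less boilerplate.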

@@ -49,7 +49,6 @@ components:
   required:
     - id
     - query
-  x-modify: requiredOptional
Contributor

Thank you for finally removing requiredOptional from the schema 👍

@@ -28,7 +28,13 @@
  */
 export type RequiredOptional<T> = { [K in keyof T]-?: [T[K]] } extends infer U
   ? U extends Record<keyof U, [unknown]>
-    ? { [K in keyof U]: U[K][0] }
+    ? {
+        [K in keyof U]: Record<string, unknown> extends U[K][0]
Contributor

Why did you need to make RequiredOptional recursive?

Contributor Author

I was experimenting with this during the hackathon to make it easier to ensure that as we add more object types to the rule schema no fields get missed during the conversion between snake and camel case. Alert suppression fields, for example, are in the alert_suppression object and some of those sub-fields are optional. We can use the non-recursive RequiredOptional on the return type of convertAlertSuppressionToSnake to cover that conversion. But, it could be useful to have a recursive RequiredOptional so we don't have to specify RequiredOptional on all the conversion functions for sub-objects as well, only at the top level converter from camel-rule to snake-rule.

It's not necessary for this PR though so I'll remove it and we can try that separately.

Contributor Author

I think this:

export type RequiredOptional<T> = { [K in keyof T]-?: [T[K]] } extends infer U
  ? U extends Record<keyof U, [unknown]>
    ? {
        [K in keyof U]: U[K][0] extends Record<string, unknown> | undefined
          ? undefined extends U[K][0]
            ? RequiredOptional<U[K][0]> | undefined
            : RequiredOptional<U[K][0]>
          : U[K][0];
      }
    : never
  : never;

works better, but I'm still not 100% sure that it handles all the type edge cases correctly. Mostly putting it here for posterity in case I or someone else comes back to try this again later.

@@ -657,7 +657,7 @@ export const commonParamsCamelToSnake = (params: BaseRuleParams) => {
   output_index: params.outputIndex,
   timeline_id: params.timelineId,
   timeline_title: params.timelineTitle,
-  meta: params.meta,
+  meta: params.meta as { [x: string]: {} | undefined },
Contributor

The type cast here is quite obscure. I only managed to understand why it's necessary after switching to the branch. The recursive RequiredOptional unintentionally changes the value type to an object instead of unknown.

Can we refrain from making RequiredOptional recursive?

Another option to avoid this type cast is to update the generator to generate something like .catchall(z.object({})) instead of .catchall(z.unknown()).

Contributor Author

Yeah, the recursive update was not necessary for the discriminated unions so I removed it for this PR

Contributor

@maximpn maximpn left a comment

@marshallmain Thank you for addressing my comments!

Contributor

@tomsonpl tomsonpl left a comment

Defend Workflows (Osquery) changes lgtm 👍

@banderror
Contributor

@elasticmachine merge upstream

@kibana-ci
Collaborator

💚 Build Succeeded

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

| id | before | after | diff |
| --- | --- | --- | --- |
| securitySolution | 12.9MB | 12.9MB | +241.0B |

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @marshallmain

Contributor

@banderror banderror left a comment

Reviewed the changes and copied the "future work" from the PR description to https://github.com/elastic/security-team/issues/7991.

This is a great improvement, thank you @marshallmain 🙏

@banderror banderror merged commit 6073eb6 into elastic:main Nov 29, 2023
30 checks passed
@kibanamachine kibanamachine added the backport:skip This commit does not require backporting label Nov 29, 2023