Currently, it's impossible for kopium to detect similar structs re-used at different levels of the hierarchy, because the schemas all have to be inlined and we cannot use $ref in schemas (see kubernetes/kubernetes#62872).
Suggestion
We should try to brute-force compare all structs (field names + types) to all previously seen structs in the analyzer to deduplicate so that we don't generate 3 different structs for things such as livenessProbe, resources, MatchLabels (under different names).
We already have the skeleton of such code, but it only checks for a few known structs such as Condition, and that check is enabled by default and can be turned off with the --no-condition flag.
If we pass existing state into the extractor functions (such as extract_container) we can then inspect current state before adding duplicates (ensuring we only take the first instance of a struct) via something like is_duplicate_struct. It would also avoid having an awkward O(n^2) algorithm at the end.
My thinking is that it might not be a particularly heavy operation on (at most) a few hundred kB of schema data.
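As a rough illustration of the idea, here is a minimal sketch of what an is_duplicate_struct check could look like. Container and Member below are simplified, hypothetical stand-ins for the analyzer's output types, and Seen, shape and insert_or_reuse are names invented for this example. Instead of brute-force comparing against every previous struct, the sketch keys a map on the sorted (field name, type) pairs, which gives the same "first instance wins" behaviour without a quadratic pass.

```rust
use std::collections::BTreeMap;

/// Simplified, hypothetical stand-in for a generated struct field.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct Member {
    pub name: String,
    pub type_: String,
}

/// Simplified, hypothetical stand-in for a generated struct.
#[derive(Debug, Clone)]
pub struct Container {
    pub name: String,
    pub members: Vec<Member>,
}

/// The "shape" of a struct: its (field name, type) pairs in sorted order.
/// The struct's own name is deliberately ignored.
fn shape(c: &Container) -> Vec<Member> {
    let mut members = c.members.clone();
    members.sort();
    members
}

/// State that would be passed into the extractor functions.
#[derive(Default)]
pub struct Seen {
    by_shape: BTreeMap<Vec<Member>, String>,
}

impl Seen {
    /// Returns the name of the first struct seen with the same shape, if any.
    pub fn is_duplicate_struct(&self, c: &Container) -> Option<&str> {
        self.by_shape.get(&shape(c)).map(String::as_str)
    }

    /// Pushes `c` to `out` unless an equivalent struct was already recorded.
    /// Returns the name that callers should reference for this shape.
    pub fn insert_or_reuse(&mut self, c: Container, out: &mut Vec<Container>) -> String {
        if let Some(existing) = self.is_duplicate_struct(&c) {
            return existing.to_string();
        }
        let name = c.name.clone();
        self.by_shape.insert(shape(&c), name.clone());
        out.push(c);
        name
    }
}
```

One caveat with this sketch: a field whose type is itself a generated struct only compares equal if the nested struct was deduplicated first and the parent refers to the reused name, so extraction would presumably need to handle children before parents.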
Limitations
This would not solve the problem where we are generating a type for something already known; e.g. something like Condition might be deduplicated with this algorithm alone, but it would not point to the correct upstream type in e.g. k8s-openapi without further logic.
This is probably fine however; it allows overrides to be done more easily (change / elide one struct rather than find and replace multiple), and it's a good stepping stone towards identifying them with upstream variants (such as is done for Condition).
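To make the "further logic" concrete, the following is a hedged sketch of how a deduplicated struct's field names could be matched against a known upstream shape. The function name upstream_type is hypothetical, and the matching rule (all required Condition fields present, with observedGeneration allowed as the only extra) is an assumption made for this example rather than a description of what kopium currently does.

```rust
use std::collections::BTreeSet;

/// Hypothetical lookup: maps a struct's field names to an upstream type path.
/// Only Condition is covered here; a real table would need more entries.
pub fn upstream_type(field_names: &[&str]) -> Option<&'static str> {
    let names: BTreeSet<&str> = field_names.iter().copied().collect();

    // Fields that metav1.Condition marks as required, plus its one optional field.
    let required: BTreeSet<&str> =
        ["lastTransitionTime", "message", "reason", "status", "type"]
            .iter()
            .copied()
            .collect();
    let mut allowed = required.clone();
    allowed.insert("observedGeneration");

    // Match only if every required field is present and nothing unknown is.
    if required.is_subset(&names) && names.is_subset(&allowed) {
        Some("k8s_openapi::apimachinery::pkg::apis::meta::v1::Condition")
    } else {
        None
    }
}
```

A check like this compares names only; comparing field types as well would make false positives less likely.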