@JAORMX JAORMX commented Sep 25, 2025

Summary

This PR addresses issue #2013 by significantly reducing the MCPServer CRD size from ~9500 lines to 651 lines (93% reduction).

Problem

At ~9500 lines, the MCPServer CRD was too large to apply without server-side apply. The embedded PodTemplateSpec schema alone accounted for ~8500 of those lines. This caused the deployment issues reported in #2013.

Solution

Changed the PodTemplateSpec field from a strongly-typed `corev1.PodTemplateSpec` to `runtime.RawExtension`, which stores the raw JSON without schema validation at the CRD level.
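
As a rough illustration of what moving validation from the CRD schema to runtime means, here is a minimal stdlib-only sketch. It is not the operator's code: `RawExtension` is a stand-in for `runtime.RawExtension` from `k8s.io/apimachinery`, the `PodTemplateSpec`, `PodSpec` and `Container` types are heavily simplified stand-ins for the `corev1` types, and `ParsePodTemplateSpec` is a hypothetical helper name.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RawExtension is a stand-in for runtime.RawExtension: it carries the
// user's JSON as opaque bytes, so the CRD needs no schema for it.
type RawExtension struct {
	Raw []byte
}

// Simplified, hypothetical stand-ins for the corev1 types.
type Container struct {
	Name  string `json:"name"`
	Image string `json:"image,omitempty"`
}

type PodSpec struct {
	ServiceAccountName string      `json:"serviceAccountName,omitempty"`
	Containers         []Container `json:"containers,omitempty"`
}

type PodTemplateSpec struct {
	Spec PodSpec `json:"spec"`
}

// ParsePodTemplateSpec decodes the raw bytes at runtime. It returns
// (nil, nil) when no template was provided, and an error when the JSON
// does not fit the expected structure.
func ParsePodTemplateSpec(raw *RawExtension) (*PodTemplateSpec, error) {
	if raw == nil || len(raw.Raw) == 0 {
		return nil, nil
	}
	var tpl PodTemplateSpec
	if err := json.Unmarshal(raw.Raw, &tpl); err != nil {
		return nil, fmt.Errorf("invalid podTemplateSpec: %w", err)
	}
	return &tpl, nil
}

func main() {
	good := &RawExtension{Raw: []byte(`{"spec":{"containers":[{"name":"mcp"}]}}`)}
	bad := &RawExtension{Raw: []byte(`{"spec":{"containers":"not-a-list"}}`)}

	for _, in := range []*RawExtension{good, bad} {
		tpl, err := ParsePodTemplateSpec(in)
		if err != nil {
			// In the operator this is where an event and a status
			// condition would be recorded instead of printing.
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted, containers:", len(tpl.Spec.Containers))
	}
}
```

The trade-off is that a malformed template is no longer rejected when the resource is applied. It is only detected when the operator decodes it, which is why the PR surfaces the failure through events and status conditions.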

Key Benefits

  • 93% CRD size reduction (from ~9500 to 651 lines)
  • Full backwards compatibility - users can still use the same YAML structure
  • Runtime validation - validation now happens at runtime in the operator
  • Proper error handling - via Kubernetes events and status conditions

Changes Made

  • Modified MCPServer type to use `runtime.RawExtension` for PodTemplateSpec
  • Updated PodTemplateSpecBuilder to unmarshal and validate at runtime
  • Added event recording and status conditions for validation errors
  • Added comprehensive tests for invalid PodTemplateSpec scenarios
  • Fixed race conditions in parallel tests

Testing

Added test coverage and test fixes, including:

  • Invalid PodTemplateSpec validation tests
  • Integration tests for error scenarios
  • Race condition fixes in parallel test execution

Breaking Changes

None - this change maintains full backwards compatibility.

Fixes #2013

codecov bot commented Sep 25, 2025

Codecov Report

❌ Patch coverage is 61.41732% with 49 lines in your changes missing coverage. Please review.
✅ Project coverage is 48.22%. Comparing base (90993f6) to head (8fc6086).
⚠️ Report is 1 commits behind head on main.

Files with missing lines | Patch % | Lines
...d/thv-operator/controllers/mcpserver_controller.go | 55.96% | 41 Missing and 7 partials ⚠️
...thv-operator/api/v1alpha1/zz_generated.deepcopy.go | 0.00% | 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2015      +/-   ##
==========================================
+ Coverage   48.17%   48.22%   +0.04%     
==========================================
  Files         233      233              
  Lines       29229    29309      +80     
==========================================
+ Hits        14082    14133      +51     
- Misses      14111    14140      +29     
  Partials     1036     1036              


@JAORMX JAORMX force-pushed the fix/mcpserver-crd-size-reduction branch from bf4a82f to c22bf13 Compare September 29, 2025 13:41
jhrozek previously approved these changes Sep 29, 2025

jhrozek commented Sep 29, 2025

aah, I approved based on reading the code :-)

But in general the approach of not reconciling again if the podTemplateSpec is bad seems good to me.

@JAORMX JAORMX force-pushed the fix/mcpserver-crd-size-reduction branch 3 times, most recently from 49c42de to 4d28669 Compare September 30, 2025 07:40
…teSpec

The MCPServer CRD was too large (~9500 lines) to apply without server-side
apply due to the embedded PodTemplateSpec taking up ~8500 lines. This was
causing issues as reported in GitHub issue #2013.

Changed the PodTemplateSpec field from a strongly-typed corev1.PodTemplateSpec
to runtime.RawExtension, which stores the raw JSON without schema validation
at the CRD level. This reduces the CRD size from ~9500 lines to 651 lines
(93% reduction).

The solution maintains full backwards compatibility - users can still use
the same YAML structure. Validation now happens at runtime in the operator,
with proper error handling via Kubernetes events and status conditions to
notify users when invalid PodTemplateSpec data is provided.

Key changes:
- Modified MCPServer type to use runtime.RawExtension for PodTemplateSpec
- Updated PodTemplateSpecBuilder to unmarshal and validate at runtime
- Added event recording and status conditions for validation errors
- Added comprehensive tests for invalid PodTemplateSpec scenarios
- Fixed race conditions in parallel tests

Fixes #2013
Signed-off-by: Juan Antonio Osorio <ozz@stacklok.com>
@JAORMX JAORMX force-pushed the fix/mcpserver-crd-size-reduction branch from 4d28669 to cd7f211 Compare September 30, 2025 11:43
@JAORMX JAORMX requested a review from jhrozek September 30, 2025 11:46
The issue was that PodTemplateSpec validation was happening early and
setting the PodTemplateValid condition, but then image validation was
also setting the ImageValidated condition without persisting it to
status. This caused the ImageValidated condition to overwrite the
PodTemplateValid condition in subsequent status updates.

The fix ensures that both validation conditions are persisted
immediately after being set:
- PodTemplateSpec validation updates status after setting condition
- Image validation now also updates status after setting condition

This ensures both conditions are present in the MCPServer status and
the invalid-podtemplatespec e2e test will pass.
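
A minimal sketch of the pattern described above, assuming simplified types: `Condition`, `SetCondition`, `Store` and the reason strings are hypothetical stand-ins, not the operator's code. In a real controller the upsert is typically done with `meta.SetStatusCondition` from `k8s.io/apimachinery/pkg/api/meta`, and persistence goes through the status subresource client.

```go
package main

import "fmt"

// Condition is a simplified stand-in for metav1.Condition.
type Condition struct {
	Type   string
	Status string
	Reason string
}

// SetCondition upserts by Type: an existing condition of the same type
// is replaced in place, anything else is appended. Setting one
// condition therefore never drops another.
func SetCondition(conds []Condition, c Condition) []Condition {
	for i := range conds {
		if conds[i].Type == c.Type {
			conds[i] = c
			return conds
		}
	}
	return append(conds, c)
}

// Store is a hypothetical stand-in for the API server's copy of status.
type Store struct {
	saved []Condition
}

// UpdateStatus persists a copy of the current conditions.
func (s *Store) UpdateStatus(conds []Condition) {
	s.saved = append([]Condition(nil), conds...)
}

func main() {
	store := &Store{}
	var conds []Condition

	// PodTemplateSpec validation: set the condition, then persist at once.
	conds = SetCondition(conds, Condition{Type: "PodTemplateValid", Status: "False", Reason: "InvalidPodTemplateSpec"})
	store.UpdateStatus(conds)

	// Image validation: same pattern, so the earlier condition survives.
	conds = SetCondition(conds, Condition{Type: "ImageValidated", Status: "True", Reason: "ImageValidationSuccess"})
	store.UpdateStatus(conds)

	for _, c := range store.saved {
		fmt.Printf("%s=%s (%s)\n", c.Type, c.Status, c.Reason)
	}
}
```

Persisting right after each condition is set costs an extra status write per validation step, but it keeps the stored status from ever reflecting only one of the two conditions.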

Signed-off-by: Juan Antonio Osorio <ozz@stacklok.com>
Successfully merging this pull request may close these issues.

The MCPServers CRD is too big