Conversation

fealebenpae (Contributor):

Summary

Proof of Work

Checklist

  • Have you linked a jira ticket and/or is the ticket in the title?
  • Have you checked whether your jira ticket required DOCSP changes?
  • Have you added a changelog file?

@fealebenpae fealebenpae self-assigned this Aug 1, 2025

@lsierant (Contributor) left a comment:
Very nice work overall! I've left a few comments.


mdbVersion, err := semver.ParseTolerant(rs.Spec.Version)
if err != nil {
log.Warnf("Failed to parse MongoDB version %q: %v. Proceeding without the automatic creation of the searchCoordinator role that's necessary for MongoDB <8.2", rs.Spec.Version, err)

lsierant:
shouldn't we return false here?

fealebenpae (author):
I'm not sure. If we return false here, we won't mirror the keyfile from Ops Manager into the secret that mongot will be waiting for. If customers are using a customized MongoDB version, it might not need the searchCoordinator role polyfill, or they might apply the role themselves in the MongoDB spec; so not returning false on a parse failure keeps that (admittedly convoluted) scenario working with Search.

On the other hand, I can't realistically imagine a case where we'd fail to parse the MongoDB version, so I'm not sure this is worth guarding against.

return true
}
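The gate being discussed can be sketched in Python (illustrative only: the real code uses semver.ParseTolerant in Go, and the function name here is made up). On a parse failure it proceeds, matching the return true above; otherwise it requires the polyfill only for MongoDB versions below 8.2.

```python
def needs_search_coordinator_polyfill(version: str) -> bool:
    """Return True when the searchCoordinator role polyfill should be created."""
    try:
        major, minor = (int(p) for p in version.split(".")[:2])
    except ValueError:
        # Mirror the Go code above: if the version cannot be parsed,
        # log and proceed rather than blocking keyfile mirroring.
        return True
    # Numeric comparison, so e.g. 8.10 correctly sorts after 8.2.
    return (major, minor) < (8, 2)
```

Note the comparison is on integer tuples, not strings, so double-digit minor versions compare correctly.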

func (r *ReconcileMongoDbReplicaSet) mirrorKeyfileIntoSecretForMongot(ctx context.Context, d om.Deployment, rs *mdbv1.MongoDB, log *zap.SugaredLogger) error {

lsierant:
I wonder if we could make this part of the search reconciler. It feels a bit awkward to have this special code just to make the search reconciler work properly; I feel we should limit the changes here to the setParameters. We know the secret name, so we can look it up and mirror it when we need it while reconciling the MongoDBSearch resource.

lsierant:
After reviewing the whole PR I think it's not even necessary as we already have mirroring on the search side in controllers/search_controller/mongodbsearch_reconcile_helper.go:142

fealebenpae (author):
We need to read the keyfile contents from Ops Manager. Unfortunately in Enterprise the only copy of the keyfile is in the automation config in Ops Manager, unlike in Community where the Operator generates the keyfile and stores it in a secret we can just mount in mongot. I considered mirroring the keyfile into a secret inside the search reconciler, but then we'd need to establish the Ops Manager connection there, so I felt the replica set reconciler would be a smaller pain.

FWIW we don't mirror the keyfile on the search side; the ensureSourceKeyfile method on the search reconcile helper fails the search resource reconciliation until the replica set controller actually creates the keyfile secret, and it applies a hash annotation to mongot's pod template so that mongot is restarted if the keyfile gets rotated.
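The hash-annotation mechanism described above can be sketched as follows (the annotation key and function name are illustrative, not the operator's actual identifiers): hashing the keyfile contents into the pod template means a rotated keyfile changes the pod spec, which makes the StatefulSet controller roll the mongot pods.

```python
import hashlib

def keyfile_hash_annotation(keyfile: bytes) -> dict[str, str]:
    # A rotated keyfile produces a different digest, so the pod template
    # changes and Kubernetes restarts mongot with the new secret mounted.
    return {"example.com/keyfile-hash": hashlib.sha256(keyfile).hexdigest()}
```

Because the digest is deterministic, re-reconciling with an unchanged keyfile produces an identical pod template and no restart.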

var search *searchv1.MongoDBSearch
searchList := &searchv1.MongoDBSearchList{}
if err := r.client.List(ctx, searchList, &client.ListOptions{
FieldSelector: fields.OneTermEqualSelector(search_controller.MongoDBSearchIndexFieldName, rs.GetNamespace()+"/"+rs.GetName()),

lsierant:
Do we make the assumption that MongoDBSearch must be in the same namespace as the MongoDB resource?

lsierant:
What I mean is that we don't copy/mirror it for Community, so we don't need it for Enterprise either; we can just mount it directly. But this will only work if we limit it to the same namespace, since we cannot reference secrets across namespaces.

fealebenpae (author):
We currently assume that the two resources are in the same namespace in Community as well; this is mostly a copy of the same code from the Community replica set reconciler. I believe all we need in order to support different namespaces is to start mirroring the keyfile and CA secrets into the Search namespace in Community.

 return &MongoDBSearchReconciler{
 kubeClient: kubernetesClient.NewClient(client),
-mdbcWatcher: &mdbcWatcher,
+mdbcWatcher: watch.New(),

@lsierant (Aug 18, 2025):
blocking: we cannot have these watchers in the controller instance. They must be placed in MongoDBSearchReconcilerHelper, because the controller is a singleton that the runtime executes concurrently when more than one simultaneous reconcile is configured.

Also, the resource watcher here is not thread safe; we should migrate to the one from the MEKO packages, which has locks (controllers/operator/watch/resource_watcher.go).

Also #2: I thought we were going to migrate away from these resource watchers in favor of using indexes?

@fealebenpae (Aug 18, 2025):

So, I tried repurposing field indexes for watches as described in Kubebuilder's latest docs, and it works pretty neatly for resources we have a direct relationship with. We can generate reconcile requests for changes in MongoDBCommunity resources by checking them against the index.

However, where this breaks down for me is watching related resources like the CA or keyfile secrets, since we don't build indexes against them. I looked into transitive events ("if the CA or keyfile secrets change, shouldn't that cause at least a temporary change in the MongoDBCommunity status, which would trigger the search controller's mdbc field-index watch?") but I didn't have time to verify that assumption. That's why I fell back to using watches.

Field indexes aside, I'll switch to the thread-safe watcher used in ReconcileCommonController, thanks for pointing out the issue.
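Conceptually (this is not the controller-runtime API, just a sketch of the lookup it performs), a field index maps an indexed value to the resources that carry it, and a watch handler turns a changed source resource into reconcile requests by querying that map:

```python
def requests_for_source(index: dict[str, list[str]], namespace: str, name: str) -> list[str]:
    # The index key mirrors the "<namespace>/<name>" field-selector key
    # used in the List call earlier in this thread.
    return index.get(f"{namespace}/{name}", [])
```

This also illustrates the limitation discussed above: only resources that were indexed (here, the MongoDBSearch objects keyed by their source MongoDB) can be found this way, so unrelated secrets need a separate watch.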

return src
}

func TestEnterpriseResourceSearchSource_Validate(t *testing.T) {

lsierant:

I like how exhaustive the cases are! 👏


def assert_search_enabled(self):
try:
result = self.client.admin.command("getCmdLineOpts")

lsierant:

Is this a reliable test if we connect to the primary in replica set mode? We won't check the other instances.

Do you think it would be better to ssh directly into all pods and check the autogenerated mongod config there, verifying the parameters are set in each pod? (We can probably use KubernetesTester.run_command_in_pod_container for cat-ing the config.)

fealebenpae (author):
That's a good point, thanks! I'll check run_command_in_pod_container out.
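The per-pod check lsierant suggests might look like this (the mechanism for exec-ing into each pod and the parameter names are assumptions; only the string check on the fetched config is shown):

```python
def config_has_search_params(config_text: str, required: tuple[str, ...]) -> bool:
    # config_text would come from cat-ing the autogenerated mongod config
    # inside each pod, e.g. via KubernetesTester.run_command_in_pod_container,
    # and this check would run once per pod rather than only on the primary.
    return all(param in config_text for param in required)
```

Running the assertion against every pod's config avoids the primary-only blind spot of getCmdLineOpts over a single client connection.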

@fixture(scope="function")
def sample_movies_helper(request, mdb: MongoDB) -> SampleMoviesSearchHelper:
credentials = (USER_NAME, USER_PASSWORD)
if request.node.get_closest_marker("use_admin_user"):

lsierant:

what does it do?

fealebenpae (author):

The test_search_restore_sample_database step is decorated with the use_admin_user marker, which causes the sample_movies_helper fixture to return a helper configured with the admin user credentials instead of the regular user's.

I found out that the restore role can only be granted on the admin database (the automation config failed to apply when I requested a user with the restore role on the sample_mflix database), so I opted to use the admin user to restore the database; that's why the restore step needs the admin user credentials instead of the regular user's.
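The working configuration amounts to granting restore against the admin database rather than the target database; MongoDB's built-in restore role is defined on admin. A sketch of the role document shape (the helper name is illustrative):

```python
def restore_capable_role() -> dict[str, str]:
    # {"role": "restore", "db": "sample_mflix"} was rejected when applying
    # the automation config; the built-in restore role lives on admin.
    return {"role": "restore", "db": "admin"}
```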

@lsierant (Aug 23, 2025):

> I found out that the restore role can be granted on the admin database only (the automation config failed to apply when I requested a user with the restore role on the sample_mflix database)

This is worrying. Does it mean that Ops Manager has more AC validations than the agents, which do exactly the same thing in headless mode in Community?

lsierant:
Also, the usage of get_closest_marker("use_admin_user") is making my head hurt. I'd like to avoid such pytest features and organize the code in a more explicit way. Can we do it differently? We don't need a fixture for sample_movies_helper at all; at any call site we can just call a get_sample_movies_helper function directly.

Also, the way we use marks is to select whole e2e tests, so adding @mark.use_admin_user might suggest we're adding a whole new e2e test (Evergreen task) here.
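The more explicit shape lsierant is asking for might look like this (the function and credential names are made up for illustration): drop the marker-driven fixture and let each call site state which credentials it wants.

```python
def get_sample_movies_helper(user: str, password: str) -> dict:
    # Call sites choose credentials explicitly instead of relying on a
    # pytest marker inspected inside a shared fixture.
    return {"credentials": (user, password)}

# The restore step would call this with admin credentials directly:
regular_helper = get_sample_movies_helper("mdb-user", "mdb-password")
admin_helper = get_sample_movies_helper("admin-user", "admin-password")
```

This keeps the admin/regular distinction visible at the test step that needs it, rather than hidden behind a mark.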

@@ -17,3 +17,7 @@ export CUSTOM_OM_VERSION

export CUSTOM_MDB_PREV_VERSION=6.0.16
export CUSTOM_MDB_VERSION=7.0.5

export MDB_SEARCH_COMMUNITY_VERSION="fbd60fb055dd500058edcb45677ea85d19421f47" # Nolan's fixes

lsierant:

let's extract this to one file, similar to how we source "${script_dir}/variables/mongodb_latest" in e2e_mdb_community. We have at least three copies of this and it will become a mess soon, especially since these image details are also specified in values.yaml and release.json.

Feel free to create a ticket for it if you feel it doesn't belong in this PR.

@lsierant force-pushed the fealebenpae/enterprise-search branch from 7dedfee to ef4880a (August 23, 2025 21:24)
@lsierant force-pushed the search/public-preview branch from d8e903f to 446c8b3 (August 26, 2025 08:54)
@fealebenpae force-pushed the fealebenpae/enterprise-search branch from ef4880a to eb64e78 (August 28, 2025 10:26)
@lsierant force-pushed the fealebenpae/enterprise-search branch from eb64e78 to ef4880a (August 28, 2025 10:56)
fealebenpae and others added 4 commits August 28, 2025 13:40
…ebenpae/enterprise-search

# Conflicts:
#	.evergreen-tasks.yml
#	controllers/operator/mongodbsearch_controller.go
#	controllers/search_controller/mongodbsearch_reconcile_helper.go
#	controllers/search_controller/mongodbsearch_reconcile_helper_test.go
#	controllers/search_controller/search_construction.go
#	mongodb-community-operator/controllers/replica_set_controller.go
@lsierant force-pushed the fealebenpae/enterprise-search branch from 90ac29e to f783f5d (August 28, 2025 20:35)
@lsierant mentioned this pull request on Aug 28, 2025