Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix replset and sharding integration tests #743

Merged

Conversation

h-haaks
Copy link
Contributor

@h-haaks h-haaks commented Apr 11, 2024

Fixing integration test that haven't been run for a very long time.
To be able to run integration test that require two hosts with gha I have created a custom ci workflow and set it to unmanaged in modulesync.

@h-haaks h-haaks force-pushed the fix-replset-acceptance-testing branch 7 times, most recently from f2f3d31 to 0d7c5d4 Compare April 15, 2024 23:37
@h-haaks
Copy link
Contributor Author

h-haaks commented Apr 15, 2024

Finally got this working locally!
I added some tests to verify replset setup as part of mongodb::server as well.

Still don't know if it's possible to run these tests in the gha test matrix ...

@h-haaks h-haaks force-pushed the fix-replset-acceptance-testing branch 4 times, most recently from 821b18e to 11c6bfe Compare April 16, 2024 11:41
@h-haaks h-haaks marked this pull request as ready for review April 16, 2024 11:49
@h-haaks h-haaks changed the title Fix replset acceptance testing Fix replset and sharding integration tests Apr 16, 2024
README.md Outdated Show resolved Hide resolved
@h-haaks
Copy link
Contributor Author

h-haaks commented Apr 16, 2024

@stevenpost Could you have a look at this?

@h-haaks h-haaks force-pushed the fix-replset-acceptance-testing branch 12 times, most recently from 61deabd to b6bb845 Compare April 17, 2024 07:52
@h-haaks h-haaks force-pushed the fix-replset-acceptance-testing branch from 33f2132 to fd05f15 Compare April 17, 2024 09:33
.sync.yml Outdated Show resolved Hide resolved
@h-haaks
Copy link
Contributor Author

h-haaks commented Apr 17, 2024

Finally got all integration tests to run.
There seems to be some instability in the replset tests with mongodb 7.0 and auth enabled.
Rerunning the failing tests a few times fixes it.

@stevenpost
Copy link
Contributor

Finally got all integration tests to run. There seems to be some instability in the replset tests with mongodb 7.0 and auth enabled. Rerunning the failing tests a few times fixes it.

I noticed the CentOS 8 tests failing a lot due to timeouts, nothing to do with the mongodb puppet module, but with the servers. I had a colleague trigger those again to make the pipeline pass.

@bastelfreak
Copy link
Member

We pull the centos 8 image from quay.io and that seems to trigger random timeouts :(

@h-haaks
Copy link
Contributor Author

h-haaks commented Apr 18, 2024

@stevenpost, I'm not talking about the quay.io issues. I'm aware of that :)
If you look at https://github.com/voxpupuli/puppet-mongodb/actions/runs/8719641032/job/23921963409?pr=743 it has 3 re-runs.
See each of the runs ( dropdown button to the right ) and you'll see what I'm talking about.

@stevenpost
Copy link
Contributor

@stevenpost, I'm not talking about the quay.io issues. I'm aware of that :) If you look at https://github.com/voxpupuli/puppet-mongodb/actions/runs/8719641032/job/23921963409?pr=743 it has 3 re-runs. See each of the runs ( dropdown button to the right ) and you'll see what I'm talking about.

I see what you mean, and it isn't limited to MongoDB 7.0.
The members of the set are defined at line 214 (https://github.com/voxpupuli/puppet-mongodb/blob/76858968bab1a981639e8925a15346992fafcd12/spec/acceptance/replset_spec.rb#L214C11-L214C70), Do you know what the values are for x? They should map to the facts networking.fqdn, otherwise you get the error:

Notice: /Stage[main]/Main/Mongodb_replset[test]/ensure: created
Warning: Host rocky9-64-2-puppet7.example.com:27017 is available, but you are unauthorized because of authentication is enabled: true
Error: /Stage[main]/Main/Mongodb_replset[test]: Could not evaluate: rs.initiate() failed for replicaset test

Followed by

Warning: User info is available only from master host

Even though at this point, both nodes are standalone.
You are then hitting this if statement in the provider:

status = if host.split(':').first == Facter.value(:fqdn)
going into the else branch, which won't work with authentication enabled.

@h-haaks h-haaks force-pushed the fix-replset-acceptance-testing branch from fd05f15 to f9957b3 Compare April 20, 2024 21:47
@h-haaks
Copy link
Contributor Author

h-haaks commented Apr 21, 2024

My last commit changed the outcome of the integration tests quite a bit.
Now there is no job failing on rs.initiate() any more.

The problem here seems to be that when rs.initiate() is invoked with more than one member its up to mongodb election to decide which server should be the primary(master).
The test code assumes that the host with master role ( the first host ) allways is elected as primary, which is not always the case.

When the second host is elected primary, the idempotency check on host 1 now fails auth to host 2 ...

@h-haaks
Copy link
Contributor Author

h-haaks commented Apr 21, 2024

Personally I'm not sure I like the overall design of the module when it comes to setting up replication.
I don't think I have seen puppet implementations that have to cross connect between multiple servers to get its job done ...

If I was to reimplement this I think I would somehow use exported resources for replicaset members and let puppet on the currently elected primary realize the exported the hosts trough the localhost connection.

@stevenpost
Copy link
Contributor

Personally I'm not sure I like the overall design of the module when it comes to setting up replication. I don't think I have seen puppet implementations that have to cross connect between multiple servers to get its job done ...

If I was to reimplement this I think I would somehow use exported resources for replicaset members and let puppet on the currently elected primary realize the exported the hosts trough the localhost connection.

I think most modules should strive to be independent of PuppetDB, not everyone has that luxury. That being said, I agree that the current approach is suboptimal.

When the second host is elected primary, the idempotency check on host 1 now fails auth to host 2 ...

Then the check is wrong? I haven't looked at this PR it in detail yet, but some things come to mind:

  • Most checks can be done on a secondary node, this is great for idempotency.
  • Changes need to be done on the master
  • Adding or removing a node from the replica set needs to be done on the master
    If we have a replicaset, perhaps it is ok to have the Puppet run on other nodes do the actual change. in the real world any change gets propagated throughout the replicaset anyway.

@h-haaks h-haaks force-pushed the fix-replset-acceptance-testing branch from f8b2735 to 7f36dac Compare April 21, 2024 22:33
@h-haaks
Copy link
Contributor Author

h-haaks commented Apr 21, 2024

Finally I got this stabilized :)
To summarize:
7f36dac is needed to make sure the fist host is always elected as primary ( master )
24bb055 is needed to prevent misleading errors saying rs.initiate() failed for replicaset #{name} when initiation actually is successful but another host is elected as primary.

I still see lots of issues with how these replset parts are implemented in the module.
I'll try to put this up for discussion in a separate issue.

@h-haaks
Copy link
Contributor Author

h-haaks commented Apr 21, 2024

Regarding f9957b3, I really want to wait for voxpupuli/puppet_metadata#124 and voxpupuli/gha-puppet#53 so that I can remove it before this is merged.

@h-haaks h-haaks force-pushed the fix-replset-acceptance-testing branch from 7f36dac to 89365d9 Compare April 25, 2024 21:45
@h-haaks h-haaks force-pushed the fix-replset-acceptance-testing branch from 89365d9 to 9636c73 Compare April 25, 2024 22:19
@bastelfreak bastelfreak added the bug Something isn't working label Apr 25, 2024
@h-haaks h-haaks merged commit e790633 into voxpupuli:master Apr 25, 2024
116 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants