Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run gh actions on enterprise Scylla #3669

Merged
merged 7 commits into from
Jan 17, 2024
Merged

Conversation

Michal-Leszczynski
Copy link
Collaborator

Just running to see test output, but enterprise releases should definitely be included in test matrix.

@Michal-Leszczynski
Copy link
Collaborator Author

Results of running SM tests against enterprise 2024.1.0-rc1:

  • SM repaired system_replicated_keys at the end of system tables (it wasn't directly specified in repair order because it's in enterprise only) - now it is repaired after system_auth.
  • trying to start the first node in the cluster with IPV6 ends with 2023-12-21 12:21:22,270 INFO spawnerr: can't find command '/usr/bin/scylla-manager-agent'

@tzach
Copy link
Collaborator

tzach commented Dec 21, 2023

You should hit scylladb/scylladb#16349 as well

@Michal-Leszczynski Michal-Leszczynski force-pushed the ml/test-enterprise branch 2 times, most recently from 8667ec4 to bf8a2bd Compare December 22, 2023 19:02
@Michal-Leszczynski Michal-Leszczynski force-pushed the ml/test-enterprise branch 6 times, most recently from ac4644b to bbe5ec0 Compare January 11, 2024 23:52
@Michal-Leszczynski Michal-Leszczynski marked this pull request as ready for review January 12, 2024 20:34
Also, add timeout on first node setup, as misconfiguration could lead to hanging at this step.
Newer Scylla versions (e.g. 2024) docker images don't run ssh server on them own, but we require it for some of SM tests.
Because of problems with restoring backups into Scylla 5.4 with raft schema enabled (#3662), we want to test the following workaround:

- use fresh cluster without raft schema
- restore as usual
- enable raft schema in the cluster

In order to do that, we leave raft schema on src cluster and test how it works with raft schema enabled/disabled on dst cluster.
@Michal-Leszczynski
Copy link
Collaborator Author

@karol-kokoszka In commit I removed the workaround (that doesn't work in real life scenarios) for the tests to work with Scylla 5.4 with raft schema, so now it's visible in test failures. As expected, the only failed jobs are restore schema jobs with newer Scylla (5.4 / 2024.1) and raft schema enabled.

There are 2 questions regarding this PR:

  1. Should we skip restore schema jobs with configuration that we know should fail, or should we leave them as they are and validate that these are the only failed checks before merging PR? I would leave them as they are so that we keep them in mind all the time and fix them when a proper solution is delivered on the core side.
  2. This PR increases the number of jobs to 81! It seems like a lot to me, but perhaps it's fine and we shouldn't worry about that. Anyway, I think that it's not necessary to check each configuration (SCYLLA_VERSION/IP_FAMILY/RAFT_ENABLED) before merging a PR, so maybe we could remove some configurations from the matrix (e.g. don't test both IP_FAIMLY=IPV6 & RAFT_ENABLED=true and IP_FAMILY=IPV4 and RAFT_ENABLED=true)?

Apart from those questions, this PR is ready for review.

@Michal-Leszczynski Michal-Leszczynski changed the title WIP: test 2024.1.0-rc1 Run gh actions on enterprise Scylla Jan 16, 2024
@karol-kokoszka
Copy link
Collaborator

There are 2 questions regarding this PR:

Should we skip restore schema jobs with configuration that we know should fail, or should we leave them as they are and validate that these are the only failed checks before merging PR? I would leave them as they are so that we keep them in mind all the time and fix them when a proper solution is delivered on the core side.

Please keep the checks. It's a scylla-manager independent problem. Although, pls add the comment in code, that the test fails on the particular configuration.

This PR increases the number of jobs to 81! It seems like a lot to me, but perhaps it's fine and we shouldn't worry about that. Anyway, I think that it's not necessary to check each configuration (SCYLLA_VERSION/IP_FAMILY/RAFT_ENABLED) before merging a PR, so maybe we could remove some configurations from the matrix (e.g. don't test both IP_FAIMLY=IPV6 & RAFT_ENABLED=true and IP_FAMILY=IPV4 and RAFT_ENABLED=true)?
Apart from those questions, this PR is ready for review.

As long as they are free of charge, we are fine.
According to this https://github.com/orgs/community/discussions/70492, they do.
We are using ubuntu-lates, which is standard free-of-charge for public repo runner.

Copy link
Collaborator

@karol-kokoszka karol-kokoszka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 , just add the comment to restore schema tests about the environment where it doesn't pass and why

Removing filtering was done so that our tests can pass with Scylla 5.4 and raft enabled, but it didn't improve the real life situations where agents don't have cross region remote location access.
@Michal-Leszczynski Michal-Leszczynski merged commit 622155b into master Jan 17, 2024
77 of 81 checks passed
@Michal-Leszczynski Michal-Leszczynski deleted the ml/test-enterprise branch January 17, 2024 12:42
@karol-kokoszka karol-kokoszka mentioned this pull request Feb 22, 2024
@karol-kokoszka karol-kokoszka mentioned this pull request Feb 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants