Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add service to run maintenance jobs #1034

Draft
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

mnonnenmacher
Copy link
Contributor

Add a maintenance job to deduplicate packages in the database. See the commit messages for details.

@mnonnenmacher mnonnenmacher force-pushed the maintenance-jobs branch 4 times, most recently from 22e8c9f to 8c327dd Compare September 16, 2024 13:04
@mnonnenmacher mnonnenmacher marked this pull request as ready for review September 16, 2024 13:04
Copy link
Contributor

@oheger-bosch oheger-bosch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess, this will work for the problem at hand (although the deduplication of projects is not addressed).
However, I am a bit uneasy with introducing this new mechanism in such a way without a clear design. We should be prepared that the whole code might be thrown away in the future.

services/maintenance/build.gradle.kts Show resolved Hide resolved
services/maintenance/src/main/kotlin/MaintenanceService.kt Outdated Show resolved Hide resolved
services/maintenance/src/main/kotlin/MaintenanceJob.kt Outdated Show resolved Hide resolved
@@ -50,6 +54,14 @@ fun Application.configureLifecycle() {
syncRoles(authorizationService)
}
}

thread {
MDC.setContextMap(mdcContext)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Will this be removed again when the job is done?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's stored in a ThreadLocal so it will die with the thread.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, what I meant was the whole block to add the job to the maintenance service. When the migration is complete it is not needed any more. If the job is started, it will probably still do some (minor) background activity.
I guess, the underlying problem here is that the minimum maintenance service implementation does not provide a mechanism to add/remove jobs dynamically; therefore, starting of jobs has to be hard-coded.

The index is required to speed up the package deduplication introduced
in a later commit.

Signed-off-by: Martin Nonnenmacher <martin.nonnenmacher@bosch.com>
When searching for an identical package, always return the match with
the lowest id. This helps with the deduplication logic introduced in a
later commit, as the deduplication will also keep the package with the
lowest id.

Signed-off-by: Martin Nonnenmacher <martin.nonnenmacher@bosch.com>
Add the model and a database table for maintenance jobs. This will
initially be used for database clean-up tasks that take a long time to
finish. The table can be used by those jobs to store progress
information to be able to continue the work if the server was restarted
before the job could finish.

Signed-off-by: Martin Nonnenmacher <martin.nonnenmacher@bosch.com>
Add an initial implementation of the `MaintenanceService` to run
maintenance jobs. This is required to run the package deduplication
which will be added in the next commit.

This first version is sufficient for the package deduplication but it
has several shortcoming that will have to be addressed later:

* The service runs on core, on the long term maintenance jobs should run
  on a separate node.
* It only supports jobs which run exactly once, support for running jobs
  multiple times or on a schedule is missing.
* The mechanism to prevent that a job is executed in parallel is based
  on a heuristic and needs to be replaced with a more sophisticated
  approach.

Signed-off-by: Martin Nonnenmacher <martin.nonnenmacher@bosch.com>
Move a helper function to the `testFixtures` to make it available for
tests in other modules.

Signed-off-by: Martin Nonnenmacher <martin.nonnenmacher@bosch.com>
Due to an issue fixed in fb57726, duplicates of packages were stored in
the database. The new mechanism to prevent duplicates is very slow if
the database contains a lot of packages (>1,000,000), so add a
maintenance job to remove the duplicates.

Signed-off-by: Martin Nonnenmacher <martin.nonnenmacher@bosch.com>
Run the maintenance service in core and add the initial job to
deduplicate packages.

Signed-off-by: Martin Nonnenmacher <martin.nonnenmacher@bosch.com>
oheger-bosch
oheger-bosch previously approved these changes Sep 19, 2024
@@ -50,6 +54,14 @@ fun Application.configureLifecycle() {
syncRoles(authorizationService)
}
}

thread {
MDC.setContextMap(mdcContext)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, what I meant was the whole block to add the job to the maintenance service. When the migration is complete it is not needed any more. If the job is started, it will probably still do some (minor) background activity.
I guess, the underlying problem here is that the minimum maintenance service implementation does not provide a mechanism to add/remove jobs dynamically; therefore, starting of jobs has to be hard-coded.

@mnonnenmacher mnonnenmacher marked this pull request as draft September 19, 2024 10:48
@sschuberth
Copy link
Contributor

Will be replaced by #1073

So should this be closed now, @mnonnenmacher?

@mnonnenmacher
Copy link
Contributor Author

Will be replaced by #1073

So should this be closed now, @mnonnenmacher?

I would like to keep this open to act as a template, because we will have to implement some other maintenance tasks, for example, to move the Keycloak sync out of core.

@mnonnenmacher mnonnenmacher changed the title Deduplicate packages in the database Add service to run maintenance jobs Oct 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants