Fix problem with running the DSL in multiple threads #751
Conversation
We use GPars `eachParallel()` to split up generation of jobs into multiple threads. This significantly reduced our run time from 25 min to 5 min.

Unfortunately, it leads to a race condition, which causes random jobs to be lost (and deleted) on each run. Ultimately, it all boils down to this non-thread-safe statement in `JobParent.processItem()`:

```groovy
referencedJobs << job
```

Making the 5 collections `asSynchronized()` fixed the problem for us.
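For illustration, a minimal sketch of the kind of change described, assuming collection fields in `JobParent` roughly like these (only `referencedJobs` is named in the thread; the other fields are illustrative):

```groovy
// Sketch only, not the actual patch. Groovy's asSynchronized() wraps a
// collection in a lock-guarded view, so a concurrent append such as
// `referencedJobs << job` from eachParallel() worker threads no longer races.
class JobParent {
    Set referencedJobs = new LinkedHashSet().asSynchronized()
    Set referencedViews = new LinkedHashSet().asSynchronized()   // illustrative field
    List queueToBuild = new ArrayList().asSynchronized()         // illustrative field
}
```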
Thank you for this pull request! Please check this document for how the Jenkins project handles pull requests.
Script execution time is becoming an issue as projects grow. The DSL was not designed to be thread-safe. Applying this change would probably fix the concurrency problems for you, but it would not make the DSL thread-safe in general. And I think it would be a major effort to build a thread-safe DSL. I want to explore other means of speeding up execution time before addressing concurrency issues. Can you take a look at #782 and test whether that change reduces script execution time for you?
Well, if I understand correctly, #782 only helps if you have multiple scripts, but we have only one script, which iterates over >300 repositories and creates >1000 jobs.
Our seed job generates ~1000 items in 90 seconds, so I guess that the time-consuming part is iterating over >300 repositories. I assume that you are using something like the GitHub API to list all repos and branches. That is what's slow, not the DSL itself. Can you confirm?

If your approach works for you, that's fine. But if we declare the DSL thread-safe, users will come up with "interesting" ways of using it, e.g.

```groovy
def startJob = job('start-all') {
    // ...
}

doInParallel { branch ->
    job("${branch}-job") {
        // ...
    }
    startJob.with {
        publishers {
            downstream("${branch}-job")
        }
    }
}
```

That will cause problems in PublisherContext. So other collections would need to be thread-safe as well. And then there is the map containing the …

IMHO trying to fix all concurrency issues is a bottomless pit, which I want to avoid if possible.
My pointer to PublisherContext was wrong; the example will cause concurrency problems with the list of configure blocks in Item.groovy#L6. All top-level methods are implemented as configure blocks.
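For context, a hedged sketch of why that list is a problem (a simplified model, not the plugin's actual source): each top-level DSL method stores its closure in a plain list on the item, so two threads configuring the same shared item race on the append.

```groovy
// Simplified model of Item: configureBlocks is an ordinary ArrayList,
// which is not safe for concurrent appends. When two eachParallel()
// workers call publishers {} on the same shared item, entries can be
// dropped or the list's internal state corrupted.
class Item {
    List<Closure> configureBlocks = []

    void publishers(Closure closure) {
        configureBlocks << closure   // unsynchronized append: the race
    }
}
```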
90 secs, that's fast. I just checked: we create 2580 items (jobs, folders and views) in ~5 min (runtime of our seed script) plus ~2 min runtime of the job-dsl-plugin (after our script finishes). We use the GitHub API to iterate over repositories, yes, but unfortunately that's not the only bottleneck. Before parallelizing anything, runtime was >25 min; if I parallelize only the GitHub API calls and synchronize on DSL commands, runtime still increases to 10 min (+2 min). Your "interesting" example is, yes, interesting :) Fortunately we don't do anything like that: we do not share jobs between threads, they stay in the thread where they are created. Do you see any concurrency problems if we limit ourselves to this? I guess …
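For illustration, a minimal sketch of the thread-confined pattern described above, using GPars; `listRepositories()` and the `repo` properties are hypothetical stand-ins for the GitHub API calls:

```groovy
import groovyx.gpars.GParsPool

GParsPool.withPool {
    listRepositories().eachParallel { repo ->
        // each job is created and configured entirely inside this worker
        // thread and is never handed to another thread afterwards
        job("${repo.name}-build") {
            scm {
                git(repo.cloneUrl)   // hypothetical repo property
            }
        }
    }
}
```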
Yes, it would be nice if we could improve the single-thread performance of Job DSL. I never optimized that and I have no idea what the hot spots are. Can you attach a profiler to Jenkins or take some thread dumps while your seed job is running?
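One possible way to grab quick thread dumps without a full profiler (my suggestion, not something proposed in the thread) is the Jenkins script console:

```groovy
// Prints a stack trace for every live thread; run this a few times while
// the seed job executes to see where time is being spent.
Thread.allStackTraces.each { thread, stack ->
    println "\"${thread.name}\" state=${thread.state}"
    stack.each { println "    at ${it}" }
}
```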
Can one of the admins verify this patch?
Just for the record, we have now been using this patch successfully for 2 years (we have to apply it manually for every new release of the plugin). There have never been any problems.
Build finished.