Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Job cacher PoC #4351

Draft
wants to merge 23 commits into
base: master
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from 13 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions .repository-cache-marker
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
Change this file if you need to invalidate the cache.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will be either invalided on exceeding maxCacheSize or if someone changes the file, e.g. incrementing 1 to 2

1
38 changes: 27 additions & 11 deletions Jenkinsfile
Original file line number Diff line number Diff line change
Expand Up @@ -19,12 +19,31 @@ def mavenEnv(Map params = [:], Closure body) {
timeout(120) {
withChecks(name: 'Tests', includeStage: true) {
infra.withArtifactCachingProxy {
withEnv([
'JAVA_HOME=/opt/jdk-' + params['jdk'],
"MAVEN_ARGS=${env.MAVEN_ARGS != null ? MAVEN_ARGS : ''} -B -ntp -Dmaven.repo.local=${WORKSPACE_TMP}/m2repo"
]) {
body()
}
def cacheBaseDir = env.JOBCACHER_CACHE_DIR ?: "/tmp"
def mavenRepoPath = "${cacheBaseDir}/.m2-cache-${params['cacheKey']}"
cache(
// max cache size in MB, will be reset after exceeding this size
maxCacheSize: 3072,
defaultBranch: 'master',
// don't save pull requests, only cache on master branches
// skipSave: env.BRANCH_NAME != 'master',
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit before merge

Suggested change
// skipSave: env.BRANCH_NAME != 'master',
skipSave: env.BRANCH_NAME != 'master',

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes think about the reliability: does the build fail (on master branch) if the cache fails to save, but the whole BOM build works?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's an issue with the cache no it doesn't fail, if it fails to save then yes I did manage to trigger this exception in one build: #4351 (comment).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which looks to be it fails if it takes 5 mins or more to upload the cache

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 Since the initial trigger of this effort (besides of course the technical efficiency and cost decrease) was to avoid slowing down or blocking the BOM team and releases, such failure would force them to deal with re-triggering builds.

Do you think it could be feasible to have a separate build in charge of seeding the cache, keeping the BOM build (even on master) to only "read" the cache and decouple both?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern with that is the added costs for putting/retrieving from S3

If the S3 bucket is in the same region as the agents, then it costs no bandwidth. It's the case: we do not use multiple AWS regions.
=> We'll only pay for the storage cost which is quite low.

The BOM is one of the rare builds on ci.jio which clearly benefit from caching: the combined costs of "EC2 machines minute not needed thanks to the cache" (a gain of ~1 hour as per @timja first tests) and "the BOM maintainers not blocked and not requiring infra team to restart builds" is clearly worth!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you try compressionMethod: 'TAR_ZSTD'? It should provide much better performance for speed and compression

Trying.


Do you think it could be feasible to have a separate build in charge of seeding the cache, keeping the BOM build (even on master) to only "read" the cache and decouple both?

Hmm possibly, maybe a parameterised build, it would need to skip the tests and just download dependencies (but make sure it resolves them all?)

Likely do-able but potentially we try without and see how we go?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likely do-able but potentially we try without and see how we go?

Fine by me if it is ok for @darinpope @alecharp @basil and @MarkEWaite

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you try compressionMethod: 'TAR_ZSTD'? It should provide much better performance for speed and compression

Same problem: https://ci.jenkins.io/job/Tools/job/bom/job/PR-4351/17/pipeline-console/?start-byte=0&selected-node=4438#log-0

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This didn't actually test it, I had the compressionMethod in the wrong place, its enabled now.

caches: [
arbitraryFileCache(
// using a fixed path for Maven cache instead of the normal workspace pattern
// due to the cache name being calculated from the path
// this prevents seeding working if you use the normal pattern
path: mavenRepoPath,
cacheValidityDecidingFile: '.repository-cache-marker'
)
]
) {
withEnv([
'JAVA_HOME=/opt/jdk-' + params['jdk'],
"MAVEN_ARGS=${env.MAVEN_ARGS != null ? MAVEN_ARGS : ''} -B -ntp -Dmaven.repo.local=${mavenRepoPath}"
]) {
body()
}
}
}
if (junit(testResults: '**/target/surefire-reports/TEST-*.xml,**/target/failsafe-reports/TEST-*.xml').failCount > 0) {
// TODO JENKINS-27092 throw up UNSTABLE status in this case
Expand Down Expand Up @@ -52,17 +71,14 @@ def fullTestMarkerFile
def weeklyTestMarkerFile

stage('prep') {
mavenEnv(jdk: 21) {
mavenEnv(jdk: 21, cacheKey: "sample-plugin") {
checkout scm
withEnv(['SAMPLE_PLUGIN_OPTS=-Dset.changelist']) {
withCredentials([
usernamePassword(credentialsId: 'app-ci.jenkins.io', usernameVariable: 'GITHUB_APP', passwordVariable: 'GITHUB_OAUTH')
]) {
sh '''
mvn -v
echo "Starting artifact caching proxy pre-heat"
mvn -ntp dependency:go-offline
echo "Finished artifact caching proxy pre-heat"
bash prep.sh
'''
}
Expand Down Expand Up @@ -102,7 +118,7 @@ if (BRANCH_NAME == 'master' || fullTestMarkerFile || weeklyTestMarkerFile || env
pluginsByRepository.each { repository, plugins ->
branches["pct-$repository-$line"] = {
def jdk = line == 'weekly' ? 21 : 17
mavenEnv(jdk: jdk, nodePool: true) {
mavenEnv(jdk: jdk, nodePool: true, cacheKey: repository) {
unstash line
withEnv([
"PLUGINS=${plugins.join(',')}",
Expand Down