Overlay: add automation ID to cache key #3080
This commit adds automation ID to the overlay-base database cache key so that we properly distinguish different analyses in the same repo for the same language. Since I am changing the cache key format, I also moved the CodeQL bundle version to the end of the cache restore key, in case we want to remove it from the restore key sometime in the future. Note that I chose to leave CACHE_VERSION unchanged because the old and the new cache keys are sufficiently different that there should be no risk of confusion.
Pull Request Overview
This PR enhances the overlay-base database caching mechanism by adding automation ID to the cache key to properly distinguish different analyses in the same repository for the same language. The change restructures the cache key format and includes hashing of additional components while moving the CodeQL bundle version to the end for future flexibility.
- Adds automation ID to cache key components for better analysis differentiation
- Introduces a hashing mechanism for cache key components to maintain manageable key length
- Restructures the cache key format and makes getCacheRestoreKey async
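The key layout described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the constant values, the `CacheKeyComponents` shape, the hash length, and the helper names `hashComponents` and `buildRestoreKeyPrefix` are all assumptions made for the example. It only shows the idea of hashing the components (including the automation ID) and placing the CodeQL bundle version last.

```typescript
import { createHash } from "node:crypto";

// Hypothetical component set; the real PR may hash different fields.
interface CacheKeyComponents {
  automationID: string;
}

// Assumed values, for illustration only.
const CACHE_PREFIX = "codeql-overlay-base-database";
const CACHE_VERSION = 1;

// Hash the components so that the key length stays manageable
// even if the automation ID is long.
function hashComponents(components: CacheKeyComponents): string {
  const componentsJson = JSON.stringify(components);
  return createHash("sha256")
    .update(componentsJson)
    .digest("hex")
    .substring(0, 16);
}

// The CodeQL version is the last element, so it could later be dropped
// from the prefix without disturbing the rest of the layout.
function buildRestoreKeyPrefix(
  language: string,
  components: CacheKeyComponents,
  codeQlVersion: string,
): string {
  const hash = hashComponents(components);
  return `${CACHE_PREFIX}-${CACHE_VERSION}-${hash}-${language}-${codeQlVersion}-`;
}
```

With this layout, two analyses of the same language in the same repository get different prefixes as long as their automation IDs differ.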
Reviewed Changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.
File | Description |
---|---|
src/overlay-database-utils.ts | Implements the core changes: adds automation ID to cache key, converts getCacheRestoreKey to async, and introduces component hashing |
src/overlay-database-utils.test.ts | Updates tests to mock getAutomationID function for the new async cache key generation |
lib/init-action.js | Generated JavaScript compilation output reflecting the TypeScript changes |
lib/analyze-action.js | Generated JavaScript compilation output reflecting the TypeScript changes |
This is great, thank you for proactively tackling this problem! I agree with your reasoning for keeping the cache version the same as well.
I only have a few minor, non-blocking comments.
src/overlay-database-utils.ts
const sha = await getCommitOid(checkoutPath);
return `${getCacheRestoreKey(config, codeQlVersion)}${sha}`;
const restoreKey = await getCacheRestoreKey(config, codeQlVersion);
return `${restoreKey}${sha}`;
One thing I am not sure I realised until now is that restoreCache is happy to use the primary cache key as a prefix for restoring a cache. I think I was previously under the assumption that only the partial restore keys (if any) were prefix-matched. It might be worth adding a comment for this somewhere, since it might otherwise be non-obvious how restoring the cache can work if sha is included here, but not when restoring the cache.
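To make the prefix behaviour concrete, here is a small self-contained model of it. It is a simplified sketch of the behaviour described in this comment, not the actual implementation of the Actions cache service or the cache library; `CacheEntry`, `getSaveKey`, and `restoreByPrefix` are hypothetical names, and picking the most recently created match is an assumption.

```typescript
// Simplified in-memory stand-in for cache entries (hypothetical).
interface CacheEntry {
  key: string;
  createdAt: number;
}

// The save key is the restore key prefix followed by the checkout SHA.
function getSaveKey(restoreKeyPrefix: string, sha: string): string {
  return `${restoreKeyPrefix}${sha}`;
}

// Restore looks up entries whose key starts with the prefix, so the SHA
// does not need to be known at restore time. Assumption: the most
// recently created matching entry wins.
function restoreByPrefix(
  entries: CacheEntry[],
  restoreKeyPrefix: string,
): CacheEntry | undefined {
  const matches = entries.filter((e) => e.key.startsWith(restoreKeyPrefix));
  matches.sort((a, b) => b.createdAt - a.createdAt);
  return matches[0];
}
```

In this model, a cache saved under `getSaveKey(prefix, sha)` is found again by passing only `prefix` to `restoreByPrefix`, which is why omitting the SHA on restore works.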
I thought about it for quite a while. What is an effective way to highlight the prefix matching used for cache restore?
I ended up making three changes:
- Rename functions and variables to highlight the difference between "save key" and "restore key"
- Append "Prefix" to the restore key function and variable names to highlight the fact that they are key prefixes
- Add comment that the save key consists of the restore key prefix followed by the checkout SHA
Hopefully that will make things clearer!
Thank you for taking the time to think about this and come up with a few changes to make this clearer! Out of those three changes, I think appending "Prefix" to the names has made the most difference, since it more clearly communicates that it is intentionally a prefix of the cache key.
I probably would have liked a comment (e.g. on getCacheRestoreKeyPrefix) that notes that the primary key is prefix-matched and therefore omitting sha works fine. The subtlety here is that restoreCache has separate parameters for the "primary key" (a string, which you use for the cache key prefix) and the "restore keys" (a string array, which you don't use right now). Although semantically they seem to work as if the primary key could just be the first element of restore keys, they are separated by both the cache library and Action. The existing comment for getCacheRestoreKeyPrefix talks about "restore keys" in "Actions cache supports using multiple restore keys", which can be interpreted as being about the "restore keys" parameter rather than the "primary key" one, so it's not clear from this that the "primary key" can also be a prefix.
This is somewhat pedantic and definitely not blocking for this PR!
This commit updates componentsJson computation to call JSON.stringify() without the replacer array and documents why the result is stable.
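One plausible reading of the stability argument, sketched under assumptions: when the components object is always built from a literal with the same non-numeric string keys in the same order, JSON.stringify() serializes those keys in insertion order, so the output is deterministic without a replacer array. The field names and the `computeComponentsJson` helper below are hypothetical and not taken from the PR.

```typescript
// Hypothetical helper: builds the components object from a literal, so the
// key order is fixed by the source code rather than by the caller.
function computeComponentsJson(automationID: string, language: string): string {
  const components = { automationID, language };
  // Non-numeric string keys are serialized in insertion order.
  return JSON.stringify(components);
}
```

The guarantee would not carry over to objects assembled dynamically in varying key order, or to keys that look like integers, which are ordered numerically.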
Changes in this PR have been validated in an internal test repository.