Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for dependency caching #2383

Draft
wants to merge 15 commits into
base: main
Choose a base branch
from
Draft

Conversation

mbg
Copy link
Member

@mbg mbg commented Jul 22, 2024

Work-in-progress to add support for dependency caching to the init Action.

Merge / deployment checklist

  • Confirm this change is backwards compatible with existing workflows.
  • Confirm the readme has been updated if necessary.
  • Confirm the changelog has been updated if necessary.

@mbg mbg self-assigned this Jul 22, 2024
Copy link
Contributor

@henrymercer henrymercer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good so far, some initial comments:


interface CacheConfig {
paths: string[];
hash: string[];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe "key" is more descriptive? Consider adding some short doc to explain what these fields do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added docs to this, but I have left it as hash for now. key may be misleading, since the cache key is comprised of more than just the hash calculated from the files that match these patterns. Perhaps hashFilePatterns would be better?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No strong feelings, I think with the doc hash is fine.

src/dependency-caching.ts Outdated Show resolved Hide resolved
const cacheConfig = CODEQL_DEFAULT_CACHE_CONFIG[language];

if (cacheConfig === undefined) {
logger.info(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should probably be debug statements before we merge this PR

// with an empty string.
const globber = await makeGlobber(cacheConfig.hash);

if ((await globber.glob()).length === 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean we list files twice (once here and once in cacheKey)? Might not be a performance problem in practice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it does. I looked at this before your review but, unfortunately, the hashFiles implementation in @action/glob isn't exposed in a way that we can just throw an existing array of paths at it, so we'd have to copy the implementation (or a variant of it, depending on how much we care about the intricacies of theirs).

continue;
}

const size = await getTotalCacheSize(cacheConfig.paths, logger);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the size is very small, should we skip the upload?

  • Pro for skipping: for small sizes, caching may slow down the run
  • Pro for uploading: this will let us deal better with registry outages or network issues

Whichever way we choose, let's document why.

Another consideration: if the size is very large, should we also skip? Larger caches are more likely to push out other cache entries. Another thing we could do is look at the existing cache usage and selectively upload based on that. This is something we can address in a followup PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have noted this in the description, but I have left this kind of thing out of the implementation for now while we are in the early stages of testing to stick to just a minimal viable implementation. I agree that we should look at this before we merge / ship anything and your points are good.

If the size is very small, should we skip the upload?

My feeling is that we probably shouldn't skip the upload if the cache is small. As you say, the benefit of dealing with third-party outages is useful, and storing/restoring a small cache should be fairly quick.

Another consideration: if the size is very large, should we also skip? Larger caches are more likely to push out other cache entries. Another thing we could do is look at the existing cache usage and selectively upload based on that. This is something we can address in a followup PR.

Agreed. I think I touched on this in the EDR as well.

src/init-action-post.ts Outdated Show resolved Hide resolved
src/init-action.ts Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants