-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(core): cache fingerprints of large assets #21321
Conversation
c6142f8
to
3664ad6
Compare
Changed to chore as the PR linter doesn't like a feature that doesn't touch the README. Let me know if |
de789cf
to
d858559
Compare
const hash2 = FileSystem.fingerprint(largefile, {}); | ||
|
||
expect(hash1).toEqual(hash2); | ||
expect(openSyncSpy).toHaveBeenCalledTimes(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should restore the mock or clear all mocks in a beforeEach()
. Because it is not cleared you end up with 3 calls in your considers mtime
test. If this test is moved up in the file for some reason it will fail (test should not rely on testing order).
Also, is there a reason to create a large file for this test? It is the caching mechanism that is tested so it should work on files of any size?
expect(openSyncSpy).toHaveBeenCalledTimes(1); | |
expect(openSyncSpy).toHaveBeenCalledTimes(1); | |
openSyncSpy.mockRestore() |
same for the second test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, we don't need to use a large file now that this is not sensitive to timing or filesize thresholds. I'll update that as well as adding a mock restore call.
@@ -84,6 +96,13 @@ export function fingerprint(fileOrDirectory: string, options: FingerprintOptions | |||
} | |||
|
|||
export function contentFingerprint(file: string): string { | |||
const stats = fs.statSync(file); | |||
const cacheKey = JSON.stringify({ mtime: stats.mtime, inode: stats.ino, size: stats.size }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put a comment here explaining why this is safe on Windows
d858559
to
4b8ecd9
Compare
When fingerprinting large assets, hashing the asset can take quite a long time - over a second for a 300MB asset, for example. This can add up, particularly when generating multiple stacks in a single build, or when running test suites that bundle assets multiple times, and is not avoidable by asset caching (since it's computing the cache key). This change caches the result of digesting individual files based on the inode, mtime, and size of the input file. This feature improved the runtime of one of our slowest tests by ~10%. closes: aws#21297
4b8ecd9
to
fc2be6c
Compare
Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork). |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork). |
mtime_unix: stats.mtime.getUTCDate(), | ||
mtime_ms: stats.mtime.getUTCMilliseconds(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not doing what is intended:
getUTCDate()
returns the UTC day-of-the-month of the Date objectgetUTCMilliseconds()
returns the UTC milliseconds component of the Date object (i.e: fractional seconds)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the catch (and fix!)
When fingerprinting large assets, hashing the asset can take quite a long time - over a second for a 300MB asset, for example. This can add up, particularly when generating multiple stacks in a single build, or when running test suites that bundle assets multiple times, and is not avoidable by asset caching (since it's computing the cache key). This change caches the result of digesting individual files based on the inode, mtime, and size of the input file. This feature improved the runtime of one of our slowest tests by ~10%. closes: aws#21297 Note: No README entries were added, because this sub-subsystem was already not documented in the README. ---- ### All Submissions: * [x] Have you followed the guidelines in our [Contributing guide?](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md) ### New Features * [ ] Have you added the new feature to an [integration test](https://github.com/aws/aws-cdk/blob/main/INTEGRATION_TESTS.md)? N/A
When fingerprinting large assets, hashing the asset can take quite a
long time - over a second for a 300MB asset, for example. This can add
up, particularly when generating multiple stacks in a single build, or
when running test suites that bundle assets multiple times, and is not
avoidable by asset caching (since it's computing the cache key).
This change caches the result of digesting individual files based on the
inode, mtime, and size of the input file.
This feature improved the runtime of one of our slowest tests by ~10%.
closes: #21297
Note: No README entries were added, because this sub-subsystem was already not documented in the README.
All Submissions:
New Features