Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(transformer): async code transformation #9889

Merged
merged 1 commit into from
Mar 13, 2021

Conversation

ychi
Copy link
Contributor

@ychi ychi commented Apr 26, 2020

Summary

Support async code transformation as mentioned in #9504

Test plan

Test cases added

@ychi ychi changed the title l [Work In Progress] Async Code Transformation Apr 26, 2020
@ychi
Copy link
Contributor Author

ychi commented Apr 26, 2020

Still very early stage, but already noticing processAsync() and getCacheKeyAsync() would require multiple layers of function to be async, many of which only one or two lines of difference from the sync version. Will try to abstract the common flows into sharable functions

@ychi ychi marked this pull request as draft April 26, 2020 16:30
@SimenB
Copy link
Member

SimenB commented Apr 26, 2020

Super exciting!

require multiple layers of function to be async, many of which only one or two lines of difference from the sync version

Yeah, I expected that. Almost all the logic will be identical, just cache lookups and the transforms themselves.

One thing is that multiple processes will probably want to transform the same files. Not a big deal in sync code, but for async code we'll quickly end up transpiling the same file a bunch of time, wasting a lot of CPU. DOes not have to be solved in this PR, but having some sort of mutex or lock per file name might make sense?

@@ -390,6 +491,135 @@ export default class ScriptTransformer {
};
}

// TODO: replace third argument with TransformOptions in Jest 26
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we might wanna look into landing all the jest 26 changes first, not sure... Would delay this PR so probably not

@ychi
Copy link
Contributor Author

ychi commented Apr 26, 2020

One thing is that multiple processes will probably want to transform the same files. Not a big deal in sync code, but for async code we'll quickly end up transpiling the same file a bunch of time, wasting a lot of CPU. DOes not have to be solved in this PR, but having some sort of mutex or lock per file name might make sense?

Yes that makes sense. A way to implement per file name lock might be shared module map that tracks file name, whether being transpiled, and cached transpile result?

@SimenB
Copy link
Member

SimenB commented Apr 26, 2020

Yeah, something like that 🙂 Again, not something we need in a first iteration, but something to keep in mind

@ychi
Copy link
Contributor Author

ychi commented Apr 27, 2020

Considering the following case: a file is associated with a processAsync-only transformer in the cache, but a synchronous transform on it is attempted. We can only throw an error in this case?

If code cache is shared between processes in the future, might be able to reduce this kind of error? (When the file was transformed asynchronously and cached before the sync call)

@SimenB
Copy link
Member

SimenB commented Apr 27, 2020

We can only throw an error in this case?

Yes.

If code cache is shared between processes in the future, might be able to reduce this kind of error?

I don't think we should do that, it would make tests passing dependent on execution order (or warm cache), which we don't want

@ychi ychi force-pushed the async-transform branch 2 times, most recently from ca916bf to f85e570 Compare May 3, 2020 09:57
@ychi
Copy link
Contributor Author

ychi commented May 3, 2020

Hi @SimenB, can you take a look and see it the changes are on the right direction? Also, a bit curious why tsc complaints about require('../package.json'); which I hadn't touched 🤔

@ychi ychi requested a review from SimenB May 3, 2020 15:09
@SimenB
Copy link
Member

SimenB commented May 3, 2020

Try a rebase, I've had weird warnings like that before that a rebase took care of

Copy link
Member

@SimenB SimenB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently on mobile, so a bit painful to review. Skimmed over and I like how this is shaping up! 👍

return transformer;
}

transform = await import(transformPath);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is to import ESM we need to add this gile to the blacklist in babel.config.js in the root of the repo.

packages/jest-transform/src/types.ts Outdated Show resolved Hide resolved
packages/jest-transform/src/types.ts Show resolved Hide resolved
options: CacheKeyOptions,
) => Promise<string>;

process: (

This comment was marked as outdated.

options?: TransformOptions,
) => TransformedSource;

processAsync?: (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not optional

@ychi ychi force-pushed the async-transform branch 2 times, most recently from 3ea5f17 to d4f36fa Compare May 10, 2020 17:03
@ychi
Copy link
Contributor Author

ychi commented May 10, 2020

Regarding tsc package.json is not under 'rootDir' complaint, it happens when import() is also used in the file. So it's not issue with git. Haven't found much discussion about this. Will play around with it more and maybe reach out to TypeScript community.

Update: there is include property available in tsconfig, using it instead of rootDir seems to work.

@codecov-io
Copy link

codecov-io commented May 10, 2020

Codecov Report

Merging #9889 (fc92ed8) into master (f73d5f6) will increase coverage by 0.16%.
The diff coverage is 98.88%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #9889      +/-   ##
==========================================
+ Coverage   64.14%   64.31%   +0.16%     
==========================================
  Files         307      307              
  Lines       13375    13444      +69     
  Branches     3260     3281      +21     
==========================================
+ Hits         8580     8647      +67     
- Misses       4089     4090       +1     
- Partials      706      707       +1     
Impacted Files Coverage Δ
packages/jest-transform/src/ScriptTransformer.ts 72.00% <98.87%> (+8.36%) ⬆️
packages/babel-jest/src/index.ts 54.05% <100.00%> (ø)
packages/expect/src/utils.ts 94.83% <0.00%> (-1.30%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f73d5f6...fc92ed8. Read the comment docs.

@ychi
Copy link
Contributor Author

ychi commented May 18, 2020

Hi @SimenB:
What was the motivation to use rootDir over include in tsconfig.json? I think there is a bug with tsc and has reported it. By using include, we won't bump into that issue. The following packages also require package.json so might encounter similar situation somewhere down the road:

  • jest-core
  • jest-haste-map
  • jest-runtime
    Does it make sense to switch to include instead?

@SimenB
Copy link
Member

SimenB commented May 19, 2020

No particular reason. If it works I'm happy to merge a PR migrating

@ychi
Copy link
Contributor Author

ychi commented May 21, 2020

I was wrong. When we use include instead of rootDir, tsc does not complain about "outside of rootDir" but does not generate proper output for the package either :(

@SimenB SimenB added this to the Jest 27 milestone Mar 4, 2021
Copy link
Member

@SimenB SimenB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

finally 😅

I think this looks really good, thanks for sticking with it @ychi! Beyond the nits I've left, I wonder (as mentioned) if we should look at solving #11081 at the same time. We can probably punt on it since it'd just streamline loading transformers and is not related to the async transformation itself. done in #11163

@ychi can you add a changelog entry and merge in (or rebase, whatever you prefer) master?

content,
options,
);
const sourceMapPath: Config.Path | null = cacheFilePath + '.map';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this cannot be null - it's a const and you assign a concatenated string

null + '.map'
"null.map"

The current code sets it to null if we don't have a source map - that seems sensible. I think that condition is missing here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, that setting to null statement now happens in _buildTransformResult. Removed the null type here (and the one in transformSource)

@@ -26,14 +26,20 @@ import {
} from 'jest-util';
import handlePotentialSyntaxError from './enhanceUnexpectedTokenMessage';
import shouldInstrument from './shouldInstrument';
// eslint somehow complains "There should be no empty line between import groups"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also if you remove the new newline on line 42?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing that new newline works, thank you!

this._getTransformer(filename) || {};
let transformerCacheKey = undefined;

if (transformer && typeof transformer.getCacheKey === 'function') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (transformer && typeof transformer.getCacheKey === 'function') {
if (typeof transformer?.getCacheKey === 'function') {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

transformerConfig,
},
);
} else if (transformer && typeof transformer.getCacheKey === 'function') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
} else if (transformer && typeof transformer.getCacheKey === 'function') {
} else if (typeof transformer?.getCacheKey === 'function') {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, removed the duplication and using await getCacheKey now

(await this._getTransformerAsync(filename)) || {};
let transformerCacheKey = undefined;

if (transformer && typeof transformer.getCacheKeyAsync === 'function') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (transformer && typeof transformer.getCacheKeyAsync === 'function') {
if (typeof transformer?.getCacheKeyAsync === 'function') {

return cached;
}

let transformer: AsyncTransformer = await import(transformPath);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should have a mutex here. Just a Map<Config.Path, Promise<void>> where the Path is transformPath and we can await that above the this._transformCache.get call. If we get a cache miss, we add a promise to the map there. that way the cache will be populated once.

checking if the map has a promise before awaiting will avoid the extra tick as well.

(make sure the promise we stick in the cache resolve to something empty so we don't keep a reference to something we don't need)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified, the Promise only gets resolved after cache has been set. A bit curious whether we can make cache and the mutex single source of truth while allowing transformer sync loading to work as-is 🤔

packages/jest-transform/src/ScriptTransformer.ts Outdated Show resolved Hide resolved
@@ -191,7 +313,7 @@ export default class ScriptTransformer {
return cached;
}

let transformer: Transformer = require(transformPath);
let transformer: SyncTransformer = require(transformPath);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we might need to refactor to load all transformers in some warmup step later so people can write their transformers in ESM (which cannot be required). Not sure if we should stick the refactor in here as well... Thoughts?

(To be clear, the transformer itself can be written in ESM but still expose sync transformations. Concrete example: #11081. While that talks about doing the import() in an async thing before using and the transformer itself being CJS, I think people will expect to be able to write ESM directly)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've implemented this in #11163

if (typeof processed === 'string') {
transformed.code = processed;
} else if (processed != null && typeof processed.code === 'string') {
transformed = processed;
} else {
throw new TypeError(
"Jest: a transform's `process` function must return a string, " +
'or an object with `code` key containing this string.',
'or an object with `code` key containing this string. ' +
"It's `processAsync` function must return that in a promise.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"It's `processAsync` function must return that in a promise.",
"It's `processAsync` function must return a Promise resolving to it.",

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -371,6 +470,9 @@ export default class ScriptTransformer {
if (map) {
const sourceMapContent =
typeof map === 'string' ? map : JSON.stringify(map);

invariant(sourceMapPath, 'We should always have default sourceMapPath');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this can ever not be set

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I add this because sourceMapPath might be set to null, making the type Config.Path | null in this function.

Another approach might be declaring a new local variable for return?

@SimenB
Copy link
Member

SimenB commented Mar 6, 2021

@ychi FYI I've opened up #11163 which will conflict horribly with this PR

@ychi
Copy link
Contributor Author

ychi commented Mar 6, 2021

Let me try to think through the mutex part this weekend and see if I can get ahead of the race😂

@ychi ychi force-pushed the async-transform branch 2 times, most recently from cd91c8d to 141b722 Compare March 7, 2021 10:30
@ychi
Copy link
Contributor Author

ychi commented Mar 7, 2021

@SimenB: feedback addressed and added changelog.
MacOS test run on Azure somehow failed 😬

@ychi ychi requested a review from SimenB March 7, 2021 12:07
CHANGELOG.md Outdated
@@ -2,6 +2,8 @@

### Features

-`[jest-transform]` Async code transformations ([#9889](https://github.com/facebook/jest/pull/9889))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be sorted alphabetically

return null;
}

if (this._transformCacheMutex.has(transformPath)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes in #11163 will make sure all transformers are loaded before starting any transformation at all. So we don't actually need this anymore 😅

I wonder if we should have a mutex on the transformation itself, though? I.e. on the cache key... I'll have a play with that when I test this PR together with the changes in jest-runtime to make use of it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just remembered we briefly touched on this idea 😀

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hah, there you go! I agree with past me - both in that we should explore it and that we don't necessarily have to do anything about it before landing this PR

@SimenB
Copy link
Member

SimenB commented Mar 7, 2021

OK, landed the other one. Sorry about the conflicts 🙈 I think the changes over there makes the changes in this PR a bit cleaner though, as it centralizes "load transformer", and for async transformation we only need to deal with async cache key and async transformation.

Tell me if you want help resolving the conflicts

@ychi
Copy link
Contributor Author

ychi commented Mar 7, 2021

@SimenB Some help would be wonderful😆 I am relocating in the coming week, available time might be fragmented. If you (or anyone) wants to chime in, please feel free. Otherwise I can come back to this in mid- to late-March. 🙂

@SimenB
Copy link
Member

SimenB commented Mar 7, 2021

Squashed, removed the mutex, and rebased except for the very latest commit. That one is a bit more hairy since we have a bunch of tests which assert we throw error about missing process during transform while that has been moved to load now. So I just pushed what I'm certain about now. Will do the rest as well

@SimenB SimenB force-pushed the async-transform branch 2 times, most recently from 6a62fb2 to 7707816 Compare March 7, 2021 21:03
@SimenB
Copy link
Member

SimenB commented Mar 7, 2021

@ychi I think this came out correctly. The tests were a bit hairy, but I think I got it 😅

@SimenB SimenB requested review from thymikee and jeysal March 7, 2021 21:08
@netlify
Copy link

netlify bot commented Mar 13, 2021

Deploy preview for jestjs ready!

Built without sensitive environment variables with commit fc92ed8

https://deploy-preview-9889--jestjs.netlify.app

@SimenB
Copy link
Member

SimenB commented Mar 13, 2021

I've played with using this in the runtime, and I think we should be good to go 👍

Need to add some docs, but I can do that now.

EDIT: actually, docs should come with the change in runtime, so I'll land this now. Thanks for you patience and perseverance @ychi!

@SimenB SimenB changed the title Async Code Transformation feat(transformer): async code transformation Mar 13, 2021
@SimenB SimenB merged commit 858c50b into jestjs:master Mar 13, 2021
@SimenB SimenB mentioned this pull request Mar 30, 2021
21 tasks
@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Please note this issue tracker is not a help forum. We recommend using StackOverflow or our discord channel for questions.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 10, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants