-
Notifications
You must be signed in to change notification settings - Fork 245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(rosetta): transliterate command does not translate in parallel #3262
Conversation
Make the `rosetta transliterate` command do translations in parallel. Achieve this by effectively running `rosetta extract` beforehand, which does to translations in parallel using worker threads, but don't save the results anywhere. Instead, use those results as an in-memory cache when doing the single pass. Since there shouldn't be any untranslated snippets left between extract and transliterate, set the `UnknownSnippetMode` to `FAIL`, which netted me errors showing that `extract` and `transliterate` don't look at the same fields! Specifically, initializer and parameters!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved, with some minor nits.
| { | ||
readonly api: 'parameter'; | ||
readonly fqn: string; | ||
readonly methodName: string | typeof INITIALIZER_METHOD_NAME; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confused here... isn't typeof INITIALIZER_METHOD_NAME
also a string? I'm not sure what's going on here but obviously this is something you intended to do. Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't typeof INITIALIZER_METHOD_NAME also a string?
In fact it's specifically the string '<initializer>'
, and no other string.
This is a TypeScript feature, where you can say that the type of a variable is a string literal. It's also used 2 lines up, where we say:
readonly api: 'parameter';
api
is not just any old string, but specifically the string 'parameter'
.
(Obviously in this particular case '<initializer>'
⊂ string
so the union is kinda pointless, but I'm writing it here as formal documentation)
const rosetta = new RosettaTabletReader({ | ||
includeCompilerDiagnostics: true, | ||
unknownSnippets: UnknownSnippetMode.TRANSLATE, | ||
unknownSnippets: UnknownSnippetMode.FAIL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small nit: if we are removing the TRANSLATE
option here, shouldn't we also remove loose: options.loose
on line 74? It doesn't change anything, but it looks like loose
for RosettaTabletReader
is used when translating missing snippets, which we are no longer doing.
Since we are not translating and loose
is an optional argument, we should remove it.
Co-authored-by: Kaizen Conroy <36202692+kaizen3031593@users.noreply.github.com>
The title of this Pull Request does not conform with [Conventional Commits] guidelines. It will need to be adjusted before the PR can be merged. |
Thank you for contributing! ❤️ I will now look into making sure the PR is up-to-date, then proceed to try and merge it! |
Merging (with squash)... |
Merging (with squash)... |
In #3262, the performance of `transliterate` was improved by running an `extract` before translating, because `extract` has the facilities to do parallel translate, and can read from a cache. However, another thing `extract` does is *discard translations from the cache* if they are considered "dirty" for some reason. This reason can be: the current source is different than the cached source, the translation mechanism has changed, compilation didn't succeed last time, or the types referenced in the example have changed. This is actually not desirable for `transliterate`, which wants to do a last-ditch effort to translate whatever is necessary to translate, but should really take what it can from the cache if there's a cached translation available. Add a flag to allow reading dirty translations from the cache.
…#3324) In #3262, the performance of `transliterate` was improved by running an `extract` before translating, because `extract` has the facilities to do parallel translate, and can read from a cache. However, another thing `extract` does is *discard translations from the cache* if they are considered "dirty" for some reason. This reason can be: the current source is different than the cached source, the translation mechanism has changed, compilation didn't succeed last time, or the types referenced in the example have changed. This is actually not desirable for `transliterate`, which wants to do a last-ditch effort to translate whatever is necessary to translate, but should really take what it can from the cache if there's a cached translation available. Add a flag to allow reading dirty translations from the cache. Also adds the `rosetta coverage` command, which I used to find out why the Construct Hub performance on the newest `aws-cdk-lib` version is still poor. --- By submitting this pull request, I confirm that my contribution is made under the terms of the [Apache 2.0 license]. [Apache 2.0 license]: https://www.apache.org/licenses/LICENSE-2.0
Make the
rosetta transliterate
command do translations in parallel.Achieve this by effectively running
rosetta extract
beforehand, whichdoes to translations in parallel using worker threads, but don't save
the results anywhere. Instead, use those results as an in-memory cache
when doing the single pass.
Since there shouldn't be any untranslated snippets left between
extract and transliterate, set the
UnknownSnippetMode
toFAIL
,which netted me errors showing that
extract
andtransliterate
don't look at the same fields! Specifically, initializer and parameters!
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.