feat(rosetta): improve translation throughput #3083
Conversation
Previously, Rosetta would divide all the examples to translate into `N` equally sized arrays, and spawn `N` workers to translate them all. Experimentation shows that the time required to translate samples is very unequally divided, and many workers used to be idle for half of the time, hurting throughput. Switch to a model where we have `N` workers, and we constantly feed them a small amount of work until all the work is done. This keeps all workers busy until the work is complete, improving the throughput a lot. On my machine, improves a run of Rosetta on the CDK repository with 8 workers from ~30m to ~15m.
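The work-queue model described here can be sketched as follows. This is a minimal illustration of the pattern, not the actual Rosetta implementation, and all names (`runPool`, `run`) are hypothetical; on Node's single-threaded event loop a shared cursor is enough to hand out tasks, whereas the real implementation dispatches to worker threads.

```typescript
// Sketch of the work-queue model (hypothetical names, not the Rosetta source):
// N workers repeatedly claim one small task from a shared queue until it is
// empty, so a worker that finishes a cheap task immediately picks up the next
// one instead of idling while others grind through expensive tasks.
async function runPool<T, R>(
  tasks: T[],
  workerCount: number,
  run: (task: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(tasks.length);
  let next = 0; // shared cursor into the task list

  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const i = next++; // claim the next task; safe on a single-threaded event loop
      results[i] = await run(tasks[i]);
    }
  }

  // Start N workers; each keeps pulling work until the queue is drained.
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Contrast this with the old model, which pre-split the task list into `N` fixed chunks: a worker unlucky enough to receive the slowest samples determines the total runtime, while the others finish early and sit idle.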
Copying @RomainMuller's comment from Slack:
Any reason not to use an existing Worker Pool solution, like https://www.npmjs.com/package/workerpool?
Apart from keeping the dependency count low and the control that comes from having the entire implementation available (easy to change parameters, memory settings, logging, etc.), not really. Want me to change it?
I'll defer to @RomainMuller, as he originally raised the point. On the positive side,
It would be better to use an existing solution rather than rolling our own, unless you think there is a significant advantage to having our own implementation. While the initial implementation is already done, there is the question of continued maintenance, which would build up if we go down this path for every new feature we want. The npm module referred to here has 3M weekly downloads, which should be sufficient indication of its reliability and updates.
Switched to using
@@ -192,3 +192,10 @@ Since TypeScript compilation takes a lot of time, much time can be gained by usi
If worker thread support is available, `jsii-rosetta` will use a number of workers equal to half the number of CPU cores,
up to a maximum of 16 workers. This default maximum can be overridden by setting the `JSII_ROSETTA_MAX_WORKER_COUNT`
environment variable.
Lines above mention "if support is available". Update those to match the current impl.
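The documented default could be computed along these lines. This is a sketch under stated assumptions, not the actual `jsii-rosetta` source: the function names are invented, and the interpretation that `JSII_ROSETTA_MAX_WORKER_COUNT` replaces the default maximum of 16 (rather than the computed count) follows the wording of the docs above.

```typescript
// Sketch of the documented default (assumed names, not the jsii-rosetta
// source): half the CPU cores, capped at a maximum of 16, where the maximum
// can be overridden via the JSII_ROSETTA_MAX_WORKER_COUNT env variable.
function maxWorkerCount(env: Record<string, string | undefined>): number {
  const override = env['JSII_ROSETTA_MAX_WORKER_COUNT'];
  return override !== undefined ? parseInt(override, 10) : 16;
}

function workerCount(
  cpuCount: number,
  env: Record<string, string | undefined> = {},
): number {
  // Half the CPU cores (at least 1), capped at the possibly overridden maximum.
  return Math.min(maxWorkerCount(env), Math.max(1, Math.floor(cpuCount / 2)));
}
```

For example, an 8-core machine would get 4 workers, while a 64-core machine would be capped at 16 unless the environment variable raises the limit.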
Thank you for contributing! ❤️ I will now look into making sure the PR is up-to-date, then proceed to try and merge it!
Merging (with squash)...
Previously, Rosetta would divide all the examples to translate into `N` equally sized arrays, and spawn `N` workers to translate them all. Experimentation shows that the time required to translate samples is very unequally divided, and many workers used to be idle for half of the time after having finished their 1/Nth of the samples, hurting throughput.

Switch to a model where we have `N` workers, and we constantly feed them a small amount of work until all the work is done. This keeps all workers busy until the work is complete, improving the throughput a lot. On my machine, improves a run of Rosetta on the CDK repository with 8 workers from ~30m to ~15m.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.