-
Notifications
You must be signed in to change notification settings - Fork 280
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
check50.check(dynamic=True) #230
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll do a more thorough review later. These are just some nitpicks.
One general concern is I think this would arguably make more sense as a module-wide decision rather than a per-check decision. I'd guess that this would also substantially simplify the implementation.
@attr.s(slots=True) | ||
class CheckSideEffects: | ||
state = attr.ib(default=None) | ||
new_checks = attr.ib(default=attr.Factory(list)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just do attr.ib(factory=list)
# - It's dependending on a dynamic check | ||
check._is_dynamic = dynamic \ | ||
or internal.check_running \ | ||
or (check._check_dependency and check._check_dependency._is_dynamic) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can just do or getattr(check._check_dependency, "_is_dynamic", False)
|
||
|
||
def register_check(self, check): | ||
self.checks.append({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably be an attrs
class or at least a namedtuple rather than just a dict.
This was something I tried out too, in fact here's a working implementation in an old commit: 702652e. I scrapped the API call later on, because in most situations marking an It seemed weird to me that you'd need to choose between the static/dynamic, especially if only a couple of checks want to pass something unpickle-able for instance. Even more so because static/dynamic can happily co-exist. You'd be surprised I think, allowing both to exist doesn't add all that much complexity if at all. |
Adressing #222
Backstory
check50 currently runs checks in parallel through multiprocessing. The idea is simple, once a check passes, all of its dependents are started simultaneously. Take for instance the following set of checks:
Once
A
passes, bothB
andC
will start running. And once those pass, their dependents start to run too. But perhaps this is easier to grasp visually:Do note that this is a simplified representation of check50s execution order, in practice check
B
might take much longer thanC
to run. That could causeF
and/orG
to start or even finish beforeB
finishes.Theoretically this model works wonders if checks spend most of their time waiting on IO, or if checks are lengthy and there is a machine with many cores available. In practice, this often isn't the case.
Motivation
In practice not every set of checks benefits all that much from concurrency. But the current design does force multiprocessing on everything. This makes it:
Approach
Make check50 single process again. Sort of. I've iterated on this design quite a bit and landed on the following:
A dynamic check runs in one process in order of definition. That's it. Jumping back to the example earlier. Let's mark
C
as dynamic:Here's what happens:
All dynamic checks run after static (old-style) checks. Each check that depends on a dynamic check or is created (more on this soon) by a dynamic check is automatically marked as dynamic itself. So no need to mark
F
andG
as such.Dynamic checks have a deterministic execution order, so there's no need to re-import the checks' module in between checks anymore. In fact, if all checks are dynamic, the checks' module is imported exactly once. 🎊
State between dynamic checks is shared in memory. There's no serialization, anything can be returned and received.
Dynamic checks can create other checks. For example, instead of:
https://github.com/cs50/problems/blob/7b83decd08252ce0d9b85fedc2281d0e01fea74b/hello/__init__.py#L14-L25
But far more interesting, through dynamic checks it is easy to wrap an existing testing tool, like Python's unittest or JUnit. Here's a meta unittest for check50, where we now run unittest to check whether check50 can create checks from a unittest module. You'll get it once you see it:
https://github.com/cs50/check50/blob/single_process/tests/checks/dynamic/python_unittest/__init__.py
Implementation
Dynamic checks use a queue for execution order. Once a dynamic check is created with the
check50.check
decorator, it is placed at the end of the queue. There's no need to add the check to the module scope.I've dubbed "importing the module to find all checks": check discovery. This now runs as a job in its own separate process, The check's module is no longer imported in the main check50 process. This process is later re-used for dynamic checks (to import the module just once).
The old hack with
runner._check_names
is replaced with a new hackrunner.CHECK_REGISTRY
. That in similar spirit keeps track of all defined checks, but now also allside_effects
(state / newly created checks) from checks. To wrap your head around this, imagine there's two sides, static and dynamic. The static side has access to static data: check names,descriptions, etc. and static side effects. The dynamic side has access to dynamic data, the actual checks in memory and dynamic side effects.In case you're worried, it is quite difficult to create a circular dependency through dynamic checks. The
check50.check
decorator forces the dependency to exist (be a check50 check) before it can be used as a dependency of another check. So there's no chicken and the egg question here, the dependency must exist prior to the dependent. And because dependencies are not mutable (unless you hack your way in) you can't go back and introduce a circular dependency.Tests
Many: https://github.com/cs50/check50/blob/single_process/tests/dynamic_tests.py
Docs
None yet. Let me know what you think first! :)