Multithreaded support #787
I guess it wouldn't actually be that hard to add some multithreaded support to the interpreter. I'm slightly worried that we might discover some race conditions in the interpreter, but given the pure nature of Cryptol itself, we might actually be in good shape. I think the bigger question is: will there be enough inherent parallelism in the algorithms we care about to make it worthwhile? Research in automatic parallelization for functional languages has been fairly disappointing, as I recall. We might be able to add a parallel map primitive to let the user be more explicit about where to fork parallel tasks. For purely data-parallel applications this might be worthwhile.
I do have something specific in mind: building the Merkle tree for PQ Merkle signature schemes is intense, but trivially parallelizable with the parallel map primitive you suggest.
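To make the Merkle tree case concrete, here is a minimal sketch of a level-by-level tree build where every level is a single parallel map. It is written in Python only for illustration and is not Cryptol or interpreter code. The names `merkle_root`, `hash_pair`, and `workers` are hypothetical, and plain SHA-256 stands in for the leaf computation, which in a real Merkle signature scheme would be the expensive derivation of a one-time public key.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def sha256(data: bytes) -> bytes:
    """Stand-in leaf/node hash (assumption: a real scheme fixes its own hash)."""
    return hashlib.sha256(data).digest()


def hash_pair(pair):
    """Hash the concatenation of a left and a right child node."""
    left, right = pair
    return sha256(left + right)


def merkle_root(leaves, workers=4):
    """Root of a binary Merkle tree over `leaves` (count must be a power of two)."""
    if not leaves or len(leaves) & (len(leaves) - 1):
        raise ValueError("number of leaves must be a nonzero power of two")
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Leaf hashes are independent of each other, so a parallel map applies.
        level = list(pool.map(sha256, leaves))
        # Each level depends only on the level below it: pair up neighbours
        # and hash all pairs with another parallel map.
        while len(level) > 1:
            pairs = list(zip(level[0::2], level[1::2]))
            level = list(pool.map(hash_pair, pairs))
    return level[0]
```

Because the node function is pure, the result does not depend on how many workers run it, which is the same property that makes a parallel map attractive for Cryptol. The sketch shows the shape of the computation only; with CPython threads and tiny inputs it should not be expected to show a real speedup.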
I was curious what it would take to do this; turns out it isn't super hard (at least for the concrete evaluator). Some very initial, but encouraging results:
Speedup is clearly less than linear for this function, but is pretty substantial nonetheless.
I wonder if some of the time is spent computing the key schedule. That only needs to happen once here, then it can be reused for each of the 101 runs. Your Now, for the second part, can you do, say, a million encrypts? Or does Cryptol run out of memory? The final part would be, obviously (@atomb), for the AES here to be a native function.
I tried something a bit more modest first: 10000 encrypts. This has been slowly growing memory usage and the machine is getting sluggish, so that's not necessarily great. I guess we'd probably be better off using a more explicit thread pool rather than just spawning all the threads and letting GHC's runtime deal with it.
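As a rough illustration of the "explicit thread pool" idea, the sketch below caps the number of tasks running at once instead of starting one thread per element. It is Python rather than Haskell and says nothing about how the interpreter would actually implement this. `pooled_map` and `ConcurrencyMeter` are hypothetical names introduced here, and the meter exists only to make the bound observable.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def pooled_map(fn, items, workers=4):
    """Apply fn to every item with at most `workers` threads, preserving order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fn, items))


class ConcurrencyMeter:
    """Records the largest number of wrapped calls in flight at the same time."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0

    def wrap(self, fn):
        def wrapped(x):
            with self._lock:
                self._active += 1
                self.peak = max(self.peak, self._active)
            try:
                return fn(x)
            finally:
                with self._lock:
                    self._active -= 1
        return wrapped


meter = ConcurrencyMeter()
squares = pooled_map(meter.wrap(lambda x: x * x), range(10000), workers=4)
# meter.peak never exceeds the pool size, however many items are mapped.
```

The pool bounds how many tasks run concurrently, but every result is still held until the whole list is returned, so this alone does not address memory growth from the output sequence.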
A little more playing around and I discovered that moving to a more map-reduce style significantly improved the memory usage.
This version seems to run in basically constant space and fills up as many cores as I give it. Hypothesis: reducing partial values early allows the memory consumed by the parallel tasks to be collected, whereas a big sequence doesn't. We might be able to improve the internals somehow to release memory once the value is computed, but it isn't immediately clear how.
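The hypothesis about early reduction can be sketched as follows: split the input into chunks, let each parallel task map and reduce its own chunk to a single value, and fold those partial values into an accumulator as they arrive. This is a Python analogy under stated assumptions, not the Cryptol code from the experiment. `par_map_reduce`, `chunks`, `chunk_size`, and `workers` are hypothetical names, and the sketch assumes `combine` is associative with `initial` as its identity element.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from itertools import islice


def chunks(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block


def par_map_reduce(fn, combine, initial, items, chunk_size=1000, workers=4):
    """Map fn over items and fold with combine, reducing each chunk early.

    Assumes combine is associative and initial is its identity element.
    """
    def reduce_chunk(block):
        # The mapped values of a chunk are consumed as they are produced,
        # so only one partial result per chunk survives the task.
        return reduce(combine, map(fn, block), initial)

    acc = initial
    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunk_iter = chunks(items, chunk_size)
        while True:
            # Materialize at most `workers` chunks at a time.
            batch = list(islice(chunk_iter, workers))
            if not batch:
                break
            for partial in pool.map(reduce_chunk, batch):
                acc = combine(acc, partial)
    return acc
```

At any moment the live data is bounded by roughly `workers * chunk_size` inputs plus one accumulator, independent of the total input length. A plain parallel map, by contrast, has to keep every mapped result alive until the consumer is done with the whole sequence.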
A good challenge would be to build a Merkle tree. It could be that parmap (with your trick) is perfect, but that Cryptol objects just use too much memory to represent such a thing for reasonable parameters. So that may be another ticket.
For posterity, here are the simple examples I've been experimenting with:
This seems to be fixed in both
The
In #868, I've updated
Is there anything holding back a multithreaded Cryptol interpreter? I think about this every time I write a `map`.