Problem:
Back when writing the initial math implementation years ago, I wasn't able to put null/nil in the more performant core.matrix implementations, and so decided to use plain Clojure vectors (of vectors) for the preliminary "raw" vote matrix (which needs a "missing value" sentinel on participant/comment entries for which there is no data). While Clojure vectors are pretty performant structures for a lot of situations, using vectors of vectors to represent an array/matrix is really bad from a performance perspective.
What I've realized since is that the more performant implementations do allow for storing JVM NaN values in proper array data structures, which should be significantly more efficient. This should have a big impact on the "bootstrapping" time it takes to load up existing conversation data to a fresh math worker instance, and will probably have a big impact on memory consumption as well.
Suggested solution:
Switch to using NaN, together with one of the more performant core.matrix implementations (vectorz, ndarray?... I forget, will search)
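To make the idea concrete, here is a minimal sketch of a NaN-as-missing-value vote matrix backed by one contiguous buffer of doubles instead of nested vectors. It is written in Python with only the standard library, not in Clojure/core.matrix, and the `VoteMatrix` class, its method names, and the vote values used below are all hypothetical illustrations rather than anything from the existing math worker.

```python
import math
from array import array

NAN = float("nan")


class VoteMatrix:
    """Hypothetical dense participants x comments matrix.

    NaN marks "no vote recorded" for a participant/comment cell.
    """

    def __init__(self, n_participants, n_comments):
        self.n_rows = n_participants
        self.n_cols = n_comments
        # One flat buffer of 8-byte doubles, pre-filled with NaN,
        # stored row by row (row-major order).
        self.data = array("d", [NAN]) * (n_participants * n_comments)

    def set_vote(self, participant, comment, vote):
        self.data[participant * self.n_cols + comment] = float(vote)

    def get_vote(self, participant, comment):
        """Return the stored vote, or None if the cell is missing."""
        value = self.data[participant * self.n_cols + comment]
        # NaN never compares equal to itself, so use isnan, not ==.
        return None if math.isnan(value) else value

    def comment_mean(self, comment):
        """Mean vote on one comment, skipping missing cells."""
        column = self.data[comment::self.n_cols]
        votes = [v for v in column if not math.isnan(v)]
        return sum(votes) / len(votes) if votes else None


# Illustrative data only: 3 participants, 2 comments.
m = VoteMatrix(3, 2)
m.set_vote(0, 0, 1)
m.set_vote(2, 0, -1)
m.set_vote(1, 1, 1)
```

The point of the sketch is the storage layout: every cell is a primitive double in a single buffer, and "missing" costs nothing extra because it is just another double value. Any aggregate over the matrix has to filter NaN explicitly, as `comment_mean` does, since NaN otherwise propagates through arithmetic.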
Thanks @metasoarous for flagging this and #1580 again today as we head into some bigger conversations :) Acknowledged and discussed with @colinmegill, will dive into this.
Here are the notes of our chat:
Worst-case impact on service
No data loss. It's mainly suboptimal from a comment routing perspective.
For the big conversation, main issue is that groups and comment routing freeze.
Secondary issue is that other conversations get blocked up and we start getting reports that the report either isn't showing or is stuck.
Would have to look at the server to see how it handles this, but @metasoarous thinks it just means the probabilities are frozen/stuck. Unsure what probability a comment that does [not?] have a routing priority would get (if not effectively zero).
Technical bottleneck
It's not even the math itself, just the loading into the data structure at the beginning of the math worker.
Short-term mitigation
Can be mitigated significantly by running a larger node. It's also possible to set up a separate instance using MATH_ZID_ALLOWLIST and ... BLOCKLIST variables, and with tuning of the kill timeout in bin/run. We can exclude a big conversation from the main worker instance on heroku, and set up a separate server that only runs that zid. So at least it is isolated.
Another option, which you hint at in #1062 ("Switch from boutique NamedMatrix impl and core.matrix to tech.ml stack for math") as "going all in right away", is to rewrite the math worker. If so, while we're at it, and given how the language and ML landscape has changed since the original Clojure implementation, this might need to be in another language, probably Python. Not a small task by any means, but after years of discussing it, this might give us the final impetus for it.