Use more performant core.matrix implementation in math worker #1579

metasoarous · 2022-12-20T23:02:11Z

Problem:
Back when writing the initial math implementation years ago, I wasn't able to put null/nill in the the more performant core.matrix implementations, and so decided to use plain Clojure vectors (of vectors) for the preliminary "raw" vote matrix (which needs a "missing value" sigil on participant/comment entries for which there is no data). While Clojure vectors are pretty performant structures for a lot of situations, using vectors of vectors to represent an array/matrix is really bad from a performance perspective.

What I've realized since is that the more performant implementations do allow for storing JVM NaN values in proper array data structures, which should be significantly more efficient. This should have a big impact on the "bootstrapping" time it takes to load up existing conversation data to a fresh math worker instance, and will probably have a big impact on memory consumption as well.

Suggested solution:
Switch to using NaN, together with one of the more performant core.matrix implementations (vectorz, ndarray?... I forget, will search)

Alternative suggestions:

This could be approached as part of Switch from boutique NamedMatrix impl and core.matrix to tech.ml stack for math #1062, but I think there's value to tackling this as a separate issue

The text was updated successfully, but these errors were encountered:

jucor · 2025-01-29T12:04:12Z

Thanks @metasoarous for flagging this and #1580 again today as we head into some bigger conversations :) Acknowledged and discussed with @colinmegill , will dive into this.
Here are the notes of our chat:

Worst-case impact on service

No data loss. It's mainly suboptimal from a comment routing perspective.
For the big conversation, main issue is that groups and comment routing freeze.
Secondary issue is that other conversations get blocked up and we start getting reports that the report either isn't showing or is stuck.
Would have to look at the server to see how it handles this, but @metasoarous thinks it just means the probabilities are frozen/stuck. Unsure what probability a comment that does [not?] have a routing priority would get (if not effectively zero).

Technical bottleneck

It's not even the math itself, just the loading into the data structure at the beginning of the math worker.

Short-term mitigation

Can be mitigated significantly by running a larger node. It's also possible to set up a separate instance using MATH_ZID_ALLOWLIST and ... BLOCKLIST variables, and with tuning of the kill timeout in bin/run. We can exclude a big conversation from the main worker instance on heroku, and set up a separate server that only runs that zid. So at least it is isolated.

Deeper mitigation

As discussed in your first post in this issue, or in Use more performant core.matrix implementation in math worker #1579
Another option, that you hint at in Switch from boutique NamedMatrix impl and core.matrix to tech.ml stack for math #1062 as "going all in right away", is to rewrite the math worker. If so, while we're at it, and given the changing landscape in languages and ML community since the original clojure implementation, this might need to be in another language, probably Python. Not a small task by any means, but after years of discussing it, this might give us the final impetus for it.

metasoarous added feature-request For new feature suggestions 🔩 p:math clojure Clojure (math module) related issues 🚀 scale-and-perf Performance and scale issues labels Dec 20, 2022

metasoarous mentioned this issue Dec 20, 2022

Review NamedMatrix update routine for performance #1580

Open

metasoarous removed the feature-request For new feature suggestions label Dec 20, 2022

jucor mentioned this issue Jan 30, 2025

[Work in Progress DO NOT MERGE] Port math library from clojure to python #1893

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use more performant core.matrix implementation in math worker #1579

Use more performant core.matrix implementation in math worker #1579

metasoarous commented Dec 20, 2022

jucor commented Jan 29, 2025 •

edited

Loading

Use more performant core.matrix implementation in math worker #1579

Use more performant core.matrix implementation in math worker #1579

Comments

metasoarous commented Dec 20, 2022

jucor commented Jan 29, 2025 • edited Loading

jucor commented Jan 29, 2025 •

edited

Loading