Refactor and expand onepass model #300

sjfleming · 2025-02-05T17:19:28Z

Closes #163
Closes #296

This is a refactor of onepass to make it more extensible. It implements the Welford algorithm for online variance calculation, and it also implements a gene-gene covariance computation via an online algorithm similar to Welford (see the "Online" subsection of https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Covariance). The covariance can be run on the raw data or on ranks (the latter allows for a computation of gene-gene Spearman correlations).
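For readers unfamiliar with the update rule, here is a minimal pure-Python sketch of the online covariance recurrence from the linked Wikipedia section. It is not the code in this PR: the class name `OnlineCovariance` and its methods are hypothetical, and it processes one observation at a time rather than batches. The diagonal of the co-moment matrix reduces to the usual Welford `M2` term, which is why the covariance version carries the same sufficient statistics as Welford plus the off-diagonal entries.

```python
class OnlineCovariance:
    """Running mean and co-moment matrix (hypothetical illustration)."""

    def __init__(self, n_features: int) -> None:
        self.n = 0
        self.mean = [0.0] * n_features
        # c[i][j] accumulates sum_k (x_ki - mean_i) * (x_kj - mean_j)
        self.c = [[0.0] * n_features for _ in range(n_features)]

    def update(self, x: list[float]) -> None:
        """Fold one observation (one value per feature) into the state."""
        self.n += 1
        # Deviation from the mean BEFORE this observation is included.
        delta_old = [xi - mi for xi, mi in zip(x, self.mean)]
        self.mean = [mi + d / self.n for mi, d in zip(self.mean, delta_old)]
        # Deviation from the mean AFTER this observation is included.
        delta_new = [xi - mi for xi, mi in zip(x, self.mean)]
        for i, di in enumerate(delta_old):
            row = self.c[i]
            for j, dj in enumerate(delta_new):
                row[j] += di * dj

    def covariance(self, ddof: int = 1) -> list[list[float]]:
        """Covariance matrix; ddof=1 gives the unbiased sample estimate."""
        denom = self.n - ddof
        return [[v / denom for v in row] for row in self.c]


# Toy data: 4 "cells" x 2 "genes".
acc = OnlineCovariance(n_features=2)
for cell in [[1.0, 2.0], [2.0, 4.0], [3.0, 7.0], [4.0, 8.0]]:
    acc.update(cell)
cov = acc.covariance()
```

The diagonal entries `cov[i][i]` are the per-gene Welford variances, and the matrix is symmetric up to floating-point rounding.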

Currently the Welford implementation lives in a separate cellarium class, as does the covariance implementation. I thought this might be cleaner than having one huge class with more input arguments, but I'm open to opposing views. Also, the class hierarchy worked a lot better with Welford as a separate class, since Welford keeps track of different sufficient statistics than the naive/shifted algorithms, while the Welford-like gene-gene covariance keeps track of the same sufficient statistics as Welford (plus more).
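To make the "different sufficient statistics" point concrete, here is a rough sketch of what each variant has to carry. The function names are hypothetical and this is plain Python, not the cellarium classes: naive keeps `(n, sum x, sum x^2)`, shifted keeps the same sums of `x - shift`, and Welford keeps `(n, mean, M2)`.

```python
def naive_variance(xs: list[float]) -> float:
    """Sufficient statistics: n, sum(x), sum(x^2)."""
    n, s, ss = 0, 0.0, 0.0
    for x in xs:
        n += 1
        s += x
        ss += x * x
    return (ss - s * s / n) / (n - 1)


def shifted_variance(xs: list[float], shift: float) -> float:
    """Sufficient statistics: n, sum(x - shift), sum((x - shift)^2)."""
    n, s, ss = 0, 0.0, 0.0
    for x in xs:
        d = x - shift
        n += 1
        s += d
        ss += d * d
    return (ss - s * s / n) / (n - 1)


def welford_variance(xs: list[float]) -> float:
    """Sufficient statistics: n, running mean, M2 (sum of squared deviations)."""
    n, mean, m2 = 0, 0.0, 0.0
    for x in xs:
        n += 1
        delta = x - mean          # deviation from the old mean
        mean += delta / n
        m2 += delta * (x - mean)  # old deviation times new deviation
    return m2 / (n - 1)


# Large offset, small spread: the exact sample variance is 30.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]
v_shifted = shifted_variance(data, shift=data[0])
v_welford = welford_variance(data)
v_naive = naive_variance(data)  # prone to catastrophic cancellation here
```

On data like this, the naive form subtracts two numbers near 4e18, so its result is not reliable, while the shifted and Welford forms recover the variance. Because the state tuples differ, sharing one base class across all three is awkward, which matches the motivation above.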

Todo:


sjfleming · 2025-02-06T05:40:16Z

Welford covariance computation on all 33k genes (using a batch size of 10k cells) spikes to about 31 GB of memory on my laptop (eyeballing Activity Monitor). Just wanted to note this as a ballpark figure. I am guessing that's why the GitHub Actions runner went OOM on 5495ce0. The CLI test now uses a Filter and computes covariance on only a handful of genes.

For speed purposes, the rule seems to be "make the batch as big as you can accommodate in memory".
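As a back-of-the-envelope check on the memory figure, here is the size of the dense buffers involved. The assumptions are mine, not stated in the PR: "33k" is taken as exactly 33,000 genes, "10k" as 10,000 cells, and elements as 4-byte float32. The helper `dense_bytes` is hypothetical.

```python
def dense_bytes(n_rows: int, n_cols: int, bytes_per_element: int) -> int:
    """Size in bytes of a dense n_rows x n_cols array."""
    return n_rows * n_cols * bytes_per_element


N_GENES = 33_000      # assumption: "33k genes" rounded
BATCH_CELLS = 10_000  # assumption: "10k cells" rounded
FLOAT32 = 4           # assumption: float32 storage

# One gene x gene matrix (the covariance accumulator), in GB (1e9 bytes).
cov_gb = dense_bytes(N_GENES, N_GENES, FLOAT32) / 1e9
# One dense cells x genes batch, in GB.
batch_gb = dense_bytes(BATCH_CELLS, N_GENES, FLOAT32) / 1e9
```

Under these assumptions a single gene-gene matrix is about 4.4 GB and a dense batch about 1.3 GB. A 31 GB peak would be consistent with several gene-gene-sized buffers being alive at once (accumulator, per-batch product, temporaries), though I haven't profiled where the peak actually comes from.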

sjfleming · 2025-02-19T19:39:30Z

Ensure batch is not empty in update()
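A small sketch of why the guard matters for a batched update. This is an illustration with a hypothetical function `merge_batch`, using the standard pairwise merge formula for `(n, mean, M2)`, and it assumes the guard raises; the actual check in `update()` may behave differently (for example, by skipping the batch).

```python
def merge_batch(
    n: int, mean: float, m2: float, batch: list[float]
) -> tuple[int, float, float]:
    """Fold a batch of scalars into the running (n, mean, M2) state."""
    m = len(batch)
    if m == 0:
        # Without this check, the batch mean below divides by zero.
        raise ValueError("batch must not be empty")
    batch_mean = sum(batch) / m
    batch_m2 = sum((x - batch_mean) ** 2 for x in batch)
    delta = batch_mean - mean
    total = n + m
    new_mean = mean + delta * m / total
    new_m2 = m2 + batch_m2 + delta * delta * n * m / total
    return total, new_mean, new_m2


state = (0, 0.0, 0.0)
state = merge_batch(*state, [1.0, 2.0])
state = merge_batch(*state, [3.0, 4.0])
n, mean, m2 = state
sample_variance = m2 / (n - 1)
```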

sjfleming added 13 commits February 4, 2025 15:08

initial notes

0381e46

Merge branch 'main' into sf-onepass-gene-gene-correlation

b7cf483

sketch out abstractions

e71040a

fixes from formatting and typechecking

43c9f8f

refactor passes existing onepass tests

1f75a55

add Welford variance tests and make them pass

fb1bc64

test covariance computation (spearman correlation failing) on single device

7e27ed9

formatting

fd96955

remove spearman cov from tests and say NotImplemented

3587e29

fix test problem

0535e1e

add CLIs and CLI tests

635fef6

linting

5495ce0

limit full covariance test genes to reduce memory use

f080090

sjfleming added 3 commits February 6, 2025 00:52

Merge branch 'main' into sf-onepass-gene-gene-correlation

b622c41

example config for welford covariance

8e13fc1

fewer genes in test

c507cd3

sjfleming added 2 commits February 21, 2025 00:50

Ensure all forwards validate input genes

800a9ad

more elaborate example

a1f134e
