[wip][clustering] visualization dashboard by mivanit · Pull Request #170 · goodfire-ai/spd

mivanit · 2025-09-25T13:41:10Z

Description

wip, marked as "ready for review" for getting CI tests to run

see #198

Related Issue

#133

Motivation and Context

How Has This Been Tested?

Does this PR introduce a breaking change?

…rker" This reverts commit 3f55ffa.

see #227 (comment)

…line-cfg

see discussion at #227 (comment) have tried to make this change as isolated as possible -- i think this was a useful feature and we may want to add it back at some point

old name didnt make sense, since it should be a path to a file with a `ClusteringRunConfig` see #227 (comment)

miv-goodfire · 2025-10-24T17:03:09Z

commenting because @mivanit has reduced perms on the repo so CI won't run. hopefully interacting will enable them to run?

EDIT: commenting wont work. committing to the PR branch will though

…/clustering-dashboard

* clustering (#43) squash Squashing #43 to a clean branch for clustering work #43 (comment) # Commits * refactor to use MergeRunConfig everywhere * Wip * wip * wip * some info to yaml * todos * LFGgsgsgs * wip * refactor configs. new system with ability to choose sampling mechanism. code will be pushed later * refactor makefile to use new configs * add tests. code still not comitted yet * format * fix tests * test configs * TESTS PASSIN LFGGGG * format * pyright fixes * fix some pyright issues * parallelizing tests * distributed tests in CI * fix action * try to make tests faster * remove experiment with no canonical run * try to debug issue with normalizing ensemble works on my machine :( * [important] remove old files * move the merge pair samplers code to math folder * fix import in tests * default to cpu if no cuda in spd-cluster * default to cpu if no cuda in spd-cluster * wip wandb logging for spd-cluster refactor * more wip wandb logging for spd-cluster refactor * format? * wandb log tensor info * wandb log tensor info wip * wandb log tensor info wip * wandb log tensor info wip * some figs on wandb * format * wip * [temp] ignore config rename/deprecated warns * wip * wip * wip * wip * wip * wip * wip * wip * wip * wip * [hack] lack of canonical runs causing error * wip * wip, cleaning up s2 * wip * nathan approved MDL cost computation * swap to log2 in mdl * log MDL cost to wandb, other minor fixes * wip * wip * wip * some logging stuff fixed. now tryna figure out why way too many merge hists are loaded * [important] got run loading to work right!!! * wip * format * [important] FEATURE COMPLETE?? * format * minor changes to wandb stuff * fix tests? * format * fix pyright errors, some warnings remain * pyright passing!!! * removed old ignores in pyproject.toml, fixed the resulting errors * ? * fix * fix registry * numprocess to auto for pytest with xdist * move the TaskName definition * remove pyright ignore * move clustering stuff into clustering folders * format * fix type error * no more dep graph * remove outdated TODO.md * remove old ruff exclude * remove unused function * remove `from __future__ import annotations` * factor out component filtering, add tests for it * tensor stats randomized for big matrix * wip * minor fix related to passing filtered labels * histograms as wandb Images instead of plotly * wip * factor out some logging stuff, lots of minor changes * wandb log semilog of merge pair cost * oops, fix semilog * log hists for stuff * close figures to avoid memory leaks, add some logging to help profile * wip, added some logging * logging stuff * pyright fixes * make the merge history more lightweight - get rid of stats dicts, we store everything in wandb anyway - get rid of old sweep stuff, we keep it in config anyway - some code used only in experiment notebooks is broken * wip * format * intervals dict * fix issue with zip file closed * wip * remove merge profiling code * get rid of some old comments * more explanation around popping #43 (comment) * components_in_pop_grp -> n_components_in_pop_grp #43 (comment) * remove ignore deprecated config warnings todo see #43 (comment) this will probably cause a lot of warnings to be logged unless canonical runs updated? * some typing fixes * fix pyright issues by pinning `transformers` package see #139 * fix annoying mock test issue * fix pyright issue * remove call to deprecated plotting function * remove dead test * give sigmoid type * update default ss decomp run * wip * give sigmoid type * wip * fix interface, store ComponentModel.module_paths maybe it can be a property that gets the keys? and asserts that keys in components and gates match? * patched model -> target model when using config * fix: no more patched_model, use target_model for config * comments * fix cuda_memory_used.py * removed `spd/clustering/math/dev.py` #43 (comment) * `StatsKeys` -> `StatsKey` #43 (comment) * remove `plot_merge_history_costs` #43 (comment) * remove old js-embedding-vis dep * Oli clustering refactor (#172) # PR #172 Commit List #172 ## Initial Work by oli-clive-griffin - [7c488fc](7c488fc) - wip - [b506574](b506574) - wip - [56f6b8a](56f6b8a) - remove srt - [8682bec](8682bec) - wip - [e3645c5](e3645c5) - wip - [b246b31](b246b31) - wip - [b44abe8](b44abe8) - it runs! ## Merge and Formatting - [f83ed30](f83ed30) - Merge branch 'feature/clustering' into feature/clustering-refactor-v1 - [7700e6e](7700e6e) - format ## Continued Refactoring by oli-clive-griffin - [3ea89ae](3ea89ae) - wip - [bc37ae3](bc37ae3) - wip - [e30d680](e30d680) - wip - [d16cb96](d16cb96) - wip - [231af5a](231af5a) - wip - [988a9c8](988a9c8) - wip - [7665879](7665879) - wip - [b01130e](b01130e) - wip - [067573a](067573a) - wip ## Integration and Fixes by mivanit - [1311a54](1311a54) - Merge branch 'feature/oli-cluster' into feature/clustering-refactor-v1 - [28b6e01](28b6e01) - format fixes - [b0304d9](b0304d9) - remove old s2 script step - [ac10010](ac10010) - wip (oli-clive-griffin) - [fce4007](fce4007) - pyright passing - [238fa31](238fa31) - fix pih tests - [94f3f92](94f3f92) - better path handling - [2b4feee](2b4feee) - reorg of pipeline - [901cebd](901cebd) - fixing tests - [60ed9d9](60ed9d9) - wip (oli-clive-griffin) - [ce01a15](ce01a15) - wip ## Storage Implementation - [a04abba](a04abba) - Merge branch 'demo/cluster-storage' into feature/clustering-refactor-v1 - [b829f3e](b829f3e) - only need storage stuff - [cc202ce](cc202ce) - wip - [f2ecda1](f2ecda1) - wip storage - [eb7a30d](eb7a30d) - move storage.py to pipeline - [b4ae5de](b4ae5de) - docstring with tree - [8970478](8970478) - make some stuff private in storage - [55ba126](55ba126) - use ClusteringStorage everywhere - [c85e357](c85e357) - Remove unused old path logic - [7e66a3a](7e66a3a) - add back model_dump_with_properties - [62dfcdf](62dfcdf) - fix tests - [a0a3b2e](a0a3b2e) - Revert config change - [1f95caf](1f95caf) - add storage tests - [d860f01](d860f01) - simplify imports ## Final Cleanup and Testing - [0b94ef1](0b94ef1) - format - [e850e41](e850e41) - format and type check fixes - [7e83333](7e83333) - re-add notebooks as tests - [1a1e521](1a1e521) - fix configs - [338b761](338b761) - allow toml config files - [e809810](e809810) - wip - [6633e69](6633e69) - wip - [987e1f0](987e1f0) - better logging - [0c9e52d](0c9e52d) - wip - [18b424d](18b424d) - no default for n samples - [9f7321d](9f7321d) - add a todo ## Final Refinements - [dc86bae](dc86bae) - add back BatchedGroupMerge util methods - [079b3f2](079b3f2) - add link in comment - [9f75588](9f75588) - inline alive_mask as ~dead_components - [73a3378](73a3378) - wip - [46d4498](46d4498) - from_file() -> read() on RunConfig - [227f6ba](227f6ba) - rename: RunConfig -> ClusteringRunConfig - [b6869c3](b6869c3) - add back comments - [22acf57](22acf57) - more comments in merge.py - [e83bc6a](e83bc6a) - comment edit - [67af4a6](67af4a6) - wip - [f9c702e](f9c702e) - fix clustering failing by reverting to shelling out inst. of pool - [7f1516a](7f1516a) - Revert "fix clustering failing by reverting to shelling out inst. of … - [0db0379](0db0379) - revert to shelling out, fix logging - [e935a78](e935a78) - delete old code - [69f0bc8](69f0bc8) - [!!!] tests passing # commit list * wip * wip * remove srt * wip * wip * wip * it runs! * format * wip * wip * wip * wip * wip * wip * wip * wip * wip * format fixes * remove old s2 script step * wip * pyright passing * fix pih tests * better path handling * reorg of pipeline * fixing tests * wip * wip * only need storage stuff * wip * wip storage * move storage.py to pipeline * docstring with tree * make some stuff private in storage * use ClusteringStorage everywhere * Remove unused old path logic * add back model_dump_with_properties * fix tests * Revert config change * add storage tests * simplify imports * format * format and type check fixes * re-add notebooks as tests * fix configs * allow toml config files * wip * wip * better logging * wip * no default for n samples * add a todo * add back BatchedGroupMerge util methods * add link in comment * inline `alive_mask` as `~dead_components` pretty sure claude changed this for no reason at some point * wip * from_file() -> read() on RunConfig * rename: RunConfig -> ClusteringRunConfig * add back comments * more comments in merge.py * comment edit * wip * fix clustering failing by reverting to shelling out inst. of pool - originally we would shell out for the different clustering processes, and use a separate fd passing json - then we switched to multiprocessing.pool - this would cause model loading to fail -- not sure why - this reverts it to the old style there are still some old files laying around, theyll be removed * Revert "fix clustering failing by reverting to shelling out inst. of pool" This reverts commit f9c702e. reverting temporarily * revert to shelling out, fix logging * delete old code * [!!!] tests passing --------- Co-authored-by: Oliver Clive-Griffin <o.clivegriffin@gmail.com> * fixes/improvements to dist_utils * type fixes? * wip * Sync everything from feature/clustering-dashboard except spd/clustering/dashboard/ * minimizing diff * minimize pyprojec.toml diff * minimizing diff, removed deps * uv sync * fixing state dict mappings * test parallelization * parallelize tests * device getting utils * add TaskName type * uv sync (pytest-xdist dep) * remove old junk from Makefile * globally unique ports in tests to allow parallelization * comments explaining port allocation in tests * add distributed marker, rull all distributed tests on same worker * Revert "add distributed marker, rull all distributed tests on same worker" This reverts commit 3f55ffa. * add distributed marker, rull all distributed tests on same worker * refactor: use general_utils methods for getting device everywhere * wip jaccard * wip jaccard * wip jaccard * wip jaccard (plotting) * found where to increase timeout * wip jaccard * make format * fixes * typing fixes * claude doing a bunch of type hinting * trying to get pyright passing? this doesnt cause any issues locally :/ * pyright works both locally and in CI * allow installing cpu-only torch in CI * figure out CI disk usage by tests on main * alternate strategy for install basically: - torch and torchvision in a `pytorch` dep group - `pytorch` dep group, along with `dev`, is installed by default (uv sync behavior unchanged) - `make install-ci` recipe which first manually installs torch stuff, then uv sync but NO `pytorch` dep group * fixes to the last commit * cleanup temp changes * make in CI * wip * wip * wip * uv sync * try to fix markupsafe? * pin markup safe with explanation * update lockfile?? * nope i think we need the index strategy * ? * markupsafe issue * remove disk usage printing * fix pyright issue * dependency hell * fix deps??? * oops, missing index strategy. moved to makefile * re-lock * make from /usr/bin/ ? * dependency hell * type checking hell * Update spd/utils/general_utils.py Co-authored-by: Dan Braun <dan.braun@goodfire.ai> * wrap and fix Conv1D imports * minimize diff cleanup * try compile-bytecode for ci install * dont compile bytecode actually compare: - no compile: https://github.com/goodfire-ai/spd/actions/runs/18316145235/job/52156789078 - yes compile: https://github.com/goodfire-ai/spd/actions/runs/18316319784/job/52157420005 approximately no speedup in tests or basedpyright run, but extra 10s added to install step (where the compilation happens) * remove markupsafe constraint? * switched to use get_obj_device * remove device: torch.device type hints see #186 (comment) * remove "distributed" test marker * fix another timeout * replace get_module_device -> get_obj_device * better comments on port uniqueness * remove old markers port uniqueness should resolve the issue, without causing slowdowns * remove timeout TODO comments * removed checks.yaml timeout todo, clustering tests pass in ~12min * [diff-min] transformers version issue from #139 resolved * fix comment * wip jaccard * pyright fixes to jaccard, wip * Update docs about grad syncing with DDP * Mention feature/memorization-experiments in README * Fix train and eval metrics and hidden_act_recon (#189) * Update canonical runs and change target model path (#197) * Avoid using too many processes in tests * fix wandb model paths to older runs `goodfire/spd/runs/{id}` -> `goodfire/spd-pre-Sep-2025/runs/{id}` see 5ba1d24 --------- Co-authored-by: Oliver Clive-Griffin <o.clivegriffin@gmail.com> Co-authored-by: Dan Braun <dan.braun@goodfire.ai> * timing cluster_ss.py * Revert "timing cluster_ss.py" This reverts commit 26c2957. * [clustering] `cluster_ss.py` speedup ci (#199) ## Description clustering tests were slow (see [`actions/runs/18406085786/job/52446229079`](https://github.com/goodfire-ai/spd/actions/runs/18406085786/job/52446229079), taking nearly 10 minutes [Fig 1]. In particular, `tests/clustering/test_clustering_experiments.py::test_cluster_ss_notebook` was taking ~500 seconds, nearly 10x the next slowest test. This test just runs `tests/clustering/scripts/cluster_ss.py` as a script. Turns out >90% of that time was spent on the dataset [Fig 2], so we switched it to be streaming. this required a minor refactor. Note that after we fixed the above, `test_cluster_ss_notebook` was fast, but streaming was not enabled for `test_clustering_with_simplestories_config` since that tests it thru cli -- hence the downloading was just happening for the latter. so, we adjust the config and cli to allow for enabling dataset streaming in see [17bfdc4](17bfdc4) ### Fig 1 ``` ============================= slowest 10 durations ============================= 504.48s call tests/clustering/test_clustering_experiments.py::test_cluster_ss_notebook 66.45s call tests/clustering/test_clustering_experiments.py::test_clustering_with_resid_mlp1_config 48.46s call tests/test_distributed.py::TestDistributedDeterminicity::test_distributed_determinicity 47.01s call tests/clustering/test_clustering_experiments.py::test_clustering_with_simplestories_config 17.16s call tests/clustering/test_clustering_experiments.py::test_cluster_resid_mlp_notebook 6.18s call tests/metrics/test_alive_components_distributed.py::TestDistributedAliveComponentsTracker::test_distributed_alive_components 5.36s call tests/test_gpt2.py::test_gpt_2_decomposition_happy_path 4.67s call tests/test_wandb_run_loading.py::test_loading_from_wandb[tms_5-2-id-wandb:goodfire/spd/runs/dwalcejo-_from_run_info] 4.56s call tests/test_wandb_run_loading.py::test_loading_from_wandb[resid_mlp2-wandb:goodfire/spd/runs/6vpsvdl5-_from_pretrained] 4.37s call tests/test_wandb_run_loading.py::test_loading_from_wandb[tms_5-2-wandb:goodfire/spd/runs/4stmo6p5-_from_run_info] ``` > from https://github.com/goodfire-ai/spd/actions/runs/18406085786/job/52446229079 ### Fig 2 ``` Timer records (s): Load model 15.36 Load data 511.21 Load data batch 0.05 Get component activations 5.11 Process activations 0.08 Plot activations 8.05 Compute merge iterations 0.51 compute distances 0.18 plot distances 0.01 ``` > from https://github.com/goodfire-ai/spd/actions/runs/18407424091/job/52450587755 ## Overall Impact - total CI runtime reduced from 12 minutes to 4 minutes. time for specifically tests reduced from 11 min to 3 min. - before: https://github.com/goodfire-ai/spd/actions/runs/18406085786/job/52446229079 - after: https://github.com/goodfire-ai/spd/actions/runs/18410271787/job/52460485800 - a few minor backwards compatible interface additions - allow specifying `dataset_streaming` for `"lm"` tasks - format specifier for saving figures (I thought svg might be a bit faster. left it in because seems useful generally) ## Related Issue See #43 (comment) ## Does this PR introduce a breaking change? No, but did modify a few interfaces. # Commits: * timing cluster_ss.py nearly identical to 26c2957 but accidentally committed that to clustering/main * better timing * fix timing * allow saving in formats besides pdf * streaming dataset for notebook to avoid download? * oops * remove timers * minimize diff * remove custom CI timing step * dataset streaming in config+cli for spd-cluster test_cluster_ss_notebook was fast, but streaming was not enabled for test_clustering_with_simplestories_config since that tests it thru cli -- hence the downloading was just happening for the latter. so, we adjust the config and cli to allow for enabling dataset streaming * wip * cuda issues??? * [temp] telemetry for action using https://github.com/catchpoint/workflow-telemetry-action * minor fixes from claude see #199 (comment) * [temp] no docker container for workflow telemetry to work * sudo in workflow for apt-get * use --dist worksteal in CI pytest * revert temp CI changes * minor fixes. accidentally removed container in CI lol * link to issue in script we do the weird thing because of CUDA issues. we should remove it at some point, and make sure that the CUDA worker catches it #201 (comment) * add num_nonsingleton_groups stat from PR170 see #170 * switch BaseModel to BaseConfig, get rid of old save/read logic (#209) adapt `clustering/main` and the associated #198 to use the new `BaseConfig` introduced in #200 * switch BaseModel to BaseConfig, get rid of old save/read logic * fix typo * fix pydantic validation issue * use model_copy to avoid editing frozen dict when updating ClusteringRunConfig from CLI * remove deprecated config fields * switch to new run * wip sigmoid issues * that worked...? * dont assert positive coacts? * get rid of long-running merge pair sampler on GPU test well... somewhere in the range of 0.5 to 1s, but its the longest running test out of all the standard tests * [clustering] Refactor to two-stage process (#203) * Have clustering be submitted via slurm jobs * Functional run_clustering.py script * Functional submitter (though workspace broken) * Refactors * Move pipeline_config.yaml to configs/ * Add non-used config entries to spd/clustering/configs/example.yaml * --n_runs --> --n-runs * Update run paths * Add calc_distances script * Fix run_pipeline wandb workspace * Have MergeConfig inherit from BaseConfig * Remove unused tensor_stats.py * Improve script docs and output file naming * Use existing random id generator which uses secrets * run_clustering -> main * Get tests passing (and remove some old tests) * Remove REFACTOR.md * Avoid loading run info twice * Add comment to test_run_clustering_happy_path * remove matplotlib related type ignore comments causing warns likely related to [f9a2a97](f9a2a97) / #210 * re-add --dist-worksteal in CI see https://github.com/goodfire-ai/spd/pull/203/files#r2425944229 * pdf_prefix -> figure_prefix (minimizing diff) * Add calc_distances to pipeline * Explicitly state distance plot file in pipeline * move ClusteringRunConfig to separate file run_clustering.py is very long, plus this helps us minimize the diff * [temp] rename {merge,clustering}_run_config for easier diff * re-add dataset_streaming option for fast CI tests * remove spd/clustering/utils it was empty * add `make clean` recipe * [wip] add back storage.py * wip * wip * make format * `Command` object continuing refactor, `make test` passing * try to fix some tests * working on fixing tests, some config issues, wip * ExecutionStamp reworking, it will manage output dirs * CI install fix -- python version used was 3.12, now we are on 3.13 * sigmoid typing fixes !!! unsure abt wht to use from CIOutputs * catch exception when checking git repo is clean * remove merge pair sampler GPU test * allow errors when calling repo_is_clean * wip debugging * fix?? * try/except workspace new view creation * fix create_clustering_workspace_view * just dont pass a wandb project to tests * config woes * wip * noneable slurm config * fix cli * wip * move `Command` to own class * storage refactor * wip * ensemble id keys must be unique if registered in db, pass None * snapshot creation, cli stuff * logging cleanup * Remove Command and use shlex * Remove Storage properties --------- Co-authored-by: Michael Ivanitskiy <mivanits@umich.edu> * [clustering] distance computation (#213) add `matching_dist` and `matching_dist_vec` distance computations * wip * better debugging plots * wip * wip * wip * format * pyright fixes * wip * wip * [!!!] remove old testing code * integrate distances with the rest of the pipeline * wip * wip * wip * wip * dev recipe * fix pyright warning * wip * wip * wip * wip * make format * cli control for calc_distances `time` command doesnt work in CI or on andromeda * rework configs, make tests simpler and faster * fix config typo distances_method -> distances_methods * [clustering] config refactor (#227) ## Core changes - `ClusteringRunConfig.idx_in_ensemble` is removed entirely. index is now auto assigned - ~~pipeline config can contain merge run config either inline or as path~~ - ~~quite annoying when doing experiments to have to go and edit two files~~ - no longer a feature, but path will be validated on config load. see discussion in #227 (comment) - remove component popping - change brought over from #206 via commit [a1f1146](a1f1146) - clustering runs will now actually use the run id for the wandb run id - spd decomp runs don't yet ## Housekeeping - separate folders for pipeline configs and merge run configs - clearer to the user, and enables testing configs without trying to infer what type they are - more validators on the configs - some config tests, importantly validating all configs in the `spd/clustering/configs/` dir - often was previously the case that config schema or var names would change, old configs would go out of date, and this would not be immediately obvious - tests for ensemble registry ## Key questions: - [x] is this way of handling "incline config" vs "path to config" fine, or should I do it the more pydantic-y way? - only path to config, see #227 (comment) - [x] we can in principle now get rid of `idx_in_ensemble` entirely, since the database read/write should handle uniqueness of indexes. I have kept it in for now, but happy to get rid of it - got rid of it, see [40df505](40df505) - ~~when a config already exists at the expected path, should we compare strings or the loaded object? the former is probably faster since pydantic validations add overhead. also, hash collisions are vanishingly unlikely here~~ N/A, removed inline config functionality Commits: * allow specifying either config path or mrc cfg in pipeline cfg * [wip] reorg configs * added default `None` for slurm partition and job name prefix * refactor configs, add config tests * fix tests * allow `None` or `-1` idx_in_ensemble - idx_in_ensemble is None iff ensemble_id is None - idx_in_ensemble == -1 will make register_clustering_run() auto-assign next avalible index - added tests for ensemble registry * whoops, wrong name on fixture * fix idx passed in tests when not needed * rename "mrc" -> "crc" in paths I forgot its no longer called "MergeRunConfig" * rename merge_run_config.py -> clustering_run_config.py * fix pyright * fix idx_in_ensemble being passed in tests * rename cache dir 'merge_run_configs' -> 'clustering_run_configs' * remove component popping changes brought in from PR #206 branch clustering/refactor-multi-batch commit [9cbb52f](9cbb52f) * dont pass batch size, change not brought in here * fix history_path extension and storage usage * dev pipeline * better config validation tests * set default base output dir * wandb use run id for clustering, TODO for spd decomp * basedpyright 1.32.0 causes issues, esp w/ wandb https://github.com/goodfire-ai/spd/actions/runs/18719611602/job/53388090437 * remove idx_in_ensemble, always auto-assigned now see #227 (comment) * only allow passing clustering run config path, not inline see discussion at #227 (comment) have tried to make this change as isolated as possible -- i think this was a useful feature and we may want to add it back at some point * rename run_clustering_config_path -> clustering_run_config_path old name didnt make sense, since it should be a path to a file with a `ClusteringRunConfig` see #227 (comment) * deps * uv lock * fix merge? * Add get_cluster_mapping.py and temp backward compatibility fixes * Add more clustering configs * spd-cluster -> spd-clustering for consistency * Remove muutils * Delete various clustering tests * Misc clean * some refactoring of slurm utils * Refactor slurm utils * Address AI comments on refactor * Remove run_locally out of slurm.py * Don't assume .env exists * misc cleanups * Fix test mocks * Stop harvesting if datasaet is depleted * Add clustering CLAUDE.md --------- Co-authored-by: Oliver Clive-Griffin <o.clivegriffin@gmail.com> Co-authored-by: Dan Braun <dan.braun@goodfire.ai>

mivanit marked this pull request as draft September 25, 2025 13:41

mivanit mentioned this pull request Sep 30, 2025

clustering #43

Merged

28 tasks

mivanit changed the base branch from main to feature/clustering October 6, 2025 09:30

mivanit and others added 27 commits October 6, 2025 14:02

comments explaining port allocation in tests

88022ea

Merge branch 'refactor/clustering-prereqs' into feature/clustering

5bcd8a3

Merge branch 'main' into feature/clustering

0e38131

add distributed marker, rull all distributed tests on same worker

3f55ffa

Revert "add distributed marker, rull all distributed tests on same wo…

29c2738

…rker" This reverts commit 3f55ffa.

add distributed marker, rull all distributed tests on same worker

f5b3288

Merge branch 'refactor/clustering-prereqs' into feature/clustering

c3dca4a

Merge branch 'main' into refactor/clustering-prereqs

3b116f5

Merge branch 'refactor/clustering-prereqs' into feature/clustering

2e60193

Merge branch 'feature/clustering' into feature/clustering-dashboard

4f72b11

add "num_nonsingleton_groups" metric to spd-cluster

d06ff34

uv sync

f39e251

wip jaccard

b058a2d

make format

d7c3258

refactor: use general_utils methods for getting device everywhere

deb59a8

Merge branch 'main' into refactor/clustering-prereqs

bac464c

Merge branch 'main' into feature/clustering

2e29b2f

Merge branch 'feature/clustering' into feature/clustering-dashboard

e23d6a8

wip jaccard

2eb96fc

wip jaccard

c71c696

Merge branch 'main' into refactor/clustering-prereqs

f9b228c

Merge branch 'refactor/clustering-prereqs' into feature/clustering

8a7a423

Merge branch 'feature/clustering' into feature/clustering-dashboard

89530b1

wip jaccard

a25a9a9

wip jaccard (plotting)

d23823d

found where to increase timeout

6280591

Merge branch 'refactor/clustering-prereqs' into feature/clustering

85f789b

mivanit added 14 commits October 24, 2025 12:14

wip

097b511

testing via playwright

4cf945f

wip testing framework

c28d0eb

wip

11315c8

working but not in tests?

72d8589

remove idx_in_ensemble, always auto-assigned now

40df505

see #227 (comment)

Merge branch 'clustering/main' into clustering/add-mrc-inline-in-pipe…

bd8a442

…line-cfg

wip

7339a9f

format

c93aea4

something fixed???

8e95223

!!!!!!!!!!!!!! PASSING?????

5fc21a3

claude still working on stuff

4561ed7

make playwright tests work in CI?

bc15921

fix pyright issues

3caf5e9

mivanit marked this pull request as ready for review October 24, 2025 16:48

mivanit changed the title ~~[clustering] visualization dashboard~~ [wip][clustering] visualization dashboard Oct 24, 2025

mivanit added 2 commits October 24, 2025 17:56

only allow passing clustering run config path, not inline

cf64a79

see discussion at #227 (comment) have tried to make this change as isolated as possible -- i think this was a useful feature and we may want to add it back at some point

rename run_clustering_config_path -> clustering_run_config_path

2a9f731

old name didnt make sense, since it should be a path to a file with a `ClusteringRunConfig` see #227 (comment)

mivanit and others added 8 commits October 24, 2025 18:04

remove junk from readme

59875b9

Merge branch 'clustering/main' into feature/clustering-dashboard

7bbd792

Merge branch 'clustering/add-mrc-inline-in-pipeline-cfg' into feature…

9be9929

…/clustering-dashboard

Merge branch 'clustering/main' into feature/clustering-dashboard

8df885e

properly install playwright

5c40ef6

dataset streaming for dashboard tests

2062b18

remove temp CI step, we now test this in pytest

e5ca948

Merge branch 'clustering/main' into feature/clustering-dashboard

040abd5

mivanit marked this pull request as draft October 27, 2025 11:29

Merge branch 'clustering/main' into feature/clustering-dashboard

687ef4c

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[wip][clustering] visualization dashboard#170

[wip][clustering] visualization dashboard#170
mivanit wants to merge 705 commits intoclustering/mainfrom
feature/clustering-dashboard

mivanit commented Sep 25, 2025 •

edited

Loading

Uh oh!

miv-goodfire commented Oct 24, 2025 •

edited by mivanit

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

mivanit commented Sep 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Related Issue

Motivation and Context

How Has This Been Tested?

Does this PR introduce a breaking change?

Uh oh!

miv-goodfire commented Oct 24, 2025 • edited by mivanit Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

mivanit commented Sep 25, 2025 •

edited

Loading

miv-goodfire commented Oct 24, 2025 •

edited by mivanit

Loading