
Conversation

@luciaquirke luciaquirke commented Sep 12, 2025

Cleanup/small fixes:

  • Add a sanity check for NaNs in index grads (these shouldn't occur)
  • Resolve post-unit-norm NaNs in Attributor (these sometimes occur when grads are flushed to zero vectors by the final dtype conversion before saving to disk)
  • Support queries where k > n
  • Support querying individual modules in addition to full-model grads
  • Use a slower default FAISS config that works on all devices; the recommended config is documented in the docstring
  • Move FAISS logic into its own file to simplify Attributor
  • Add FAISS CLI flag to query_index
  • Deprecate unstructured gradient indices
  • Harden check for existing FAISS index
  • Remove redundant unit norm
  • Rename chunk to shard
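
One item above is handling queries where k > n. A minimal sketch of how that case can be handled, shown here with NumPy rather than FAISS for brevity (the function name and shape are hypothetical, not taken from the PR):

```python
import numpy as np

def top_k_neighbors(scores: np.ndarray, k: int) -> np.ndarray:
    """Return indices of the top-k scores, clamping k so that
    queries with k > n return all n items instead of raising."""
    k = min(k, len(scores))  # support queries where k > n
    idx = np.argpartition(-scores, k - 1)[:k]  # O(n) partial select
    return idx[np.argsort(-scores[idx])]       # order the k hits by score
```

With three items, asking for ten neighbors simply returns all three, ranked by score.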

@luciaquirke luciaquirke changed the title Extract out FAISS index from Attributor Support module queries; handle NaN grads; refactor Attributor Sep 12, 2025
"--model", type=str, default="HuggingFaceTB/SmolLM2-135M-Instruct"
)
parser.add_argument("--dataset", type=str, default="EleutherAI/SmolLM2-135M-10B")
parser.add_argument("--dataset", type=str, default="RonenEldan/TinyStories")
@luciaquirke luciaquirke Sep 12, 2025


Smaller default dataset for casual consumers/prototyping

@norabelrose norabelrose left a comment


just use epsilon instead of nan_to_num and then I think it's good to go

```python
for name in q:
    q[name] /= norm
    # Zero gradients will be NaN after normalization
    q[name] = q[name].nan_to_num(0)
```

hmm I guess this works although the standard way to do this is to add epsilon, like 1e-8. It is a hyperparameter but it means the function isn't sharply discontinuous around zero
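
The epsilon variant the reviewer suggests could look roughly like this (a sketch using NumPy for self-containment; the dict name `q` follows the snippet above, `unit_normalize` is a hypothetical helper, and 1e-8 is a tunable hyperparameter):

```python
import numpy as np

EPS = 1e-8  # hyperparameter; avoids division by zero for all-zero grads

def unit_normalize(q: dict) -> dict:
    """Unit-normalize a dict of per-module gradient arrays by their
    joint norm. Adding EPS keeps the map continuous around zero:
    all-zero grads normalize to zero instead of NaN."""
    norm = np.sqrt(sum(float(np.square(g).sum()) for g in q.values()))
    return {name: g / (norm + EPS) for name, g in q.items()}
```

Unlike the nan_to_num approach, nearly-zero gradients are scaled smoothly rather than jumping between a finite vector and zero.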

@luciaquirke luciaquirke merged commit 22c0553 into main Sep 16, 2025
1 check passed
