Open
Conversation
Added two documentation files to help AI assistants work effectively with the SPD codebase: - CLAUDE_COMPREHENSIVE.md: Complete reference guide covering development philosophy, coding standards, architecture patterns, workflows, and collaboration practices - CLAUDE_CHECKLIST.md: Pre-submission checklist for verifying code changes meet SPD standards before committing These documents ensure consistent code quality and help future AI assistants understand project conventions, reducing onboarding time and maintaining codebase consistency. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Added two checklist items to prevent future AI assistants from forgetting important steps: - "Checked existing patterns" item to ensure new files follow existing conventions - "Restarted checklist after any changes" with explicit STOP instruction to prevent incomplete verification Also fixed references from "dev branch" to "main branch" throughout both documentation files, as the repository uses main as the primary development branch. These changes address feedback from PR review process where these steps were accidentally omitted. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Implements a new metric that computes the cross-entropy difference between adversarially-optimized PGD-masked model outputs and the target model. The metric uses Projected Gradient Descent (PGD) to find adversarial masks that maximize CE loss against true labels, then reports the CE difference from the target model. This complements existing PGD metrics (MSE, KL) and follows the same pattern as CEandKLLosses for computing CE differences. The metric is exported in spd/metrics/__init__.py for use in experiments.
Contributor
Author
|
@claude Please review this PR (specifically the |
Contributor
|
I'll analyze this and get back to you. |
- Add PGDCEDiffConfig to configs.py - Add to EvalOnlyMetricConfigType union - Create test config with PGD CE diff metric and reduced steps
- Import PGDCEDiff and PGDCEDiffConfig - Add case to init_metric for PGDCEDiff instantiation
Add check for 3D output shape (batch, seq_len, vocab) and return zero if not applicable. This allows the metric to be included in configs but skip computation for non-LM tasks like ResidualMLP.
df33975 to
30d59fc
Compare
danbraunai
suggested changes
Nov 25, 2025
Contributor
danbraunai
left a comment
There was a problem hiding this comment.
Misc comments:
- This PR has CLAUDE_CHECKLIST.md and CLAUDE_COMPREHENSIVE.md files in it, we'd want to remove those if merging
- This PR implements the full-layer version of the loss. We'd want to confirm that we only want this and not also want the subset and layerwise versions of it.
- Not sure if we actually want this as a Metric that we can use to train on. If we don't want it as a metric, we can probably make it simpler. We may even want to put this the existing ce_and_kl_losses.py file, although I'd make sure that this won't massively slow down that calculation. If it did slow it down, then having it as a separate file that is optional to run makes sense (I suppose maybe we'd want it as a proper "metric" if doing that). Also be mindful if going the ce_and_kl_losses.py route that doesn't reduce over ranks, which is problematic for shared_over_batch.
- I think we can make minor modifications to the existing pgd functions in pgd_utils.py so that they can return CE or KL. Then we could just call those functions instead of rewriting a lot of the stuff here.
So I think we'll need to chat with Lucius who suggested this feature about some more concrete specs.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

This PR adds a new metric
PGDCEDiffthat computes the cross-entropy (CE) difference between adversarially-optimized PGD-masked model outputs andthe target model.
Changes:
spd/metrics/pgd_ce_diff.py- Implements thePGDCEDiffmetric classspd/configs.py- AddsPGDCEDiffConfigand registers it inEvalOnlyMetricConfigTypespd/eval.py- Adds metric initialization logic and importsspd/metrics/__init__.py- Exports the new metricKey Features:
ce_difference_pgd_maskedrepresenting:CE(pgd_masked_output) - CE(target_output)PGDReconLoss,PGDReconLayerwiseLoss)CEandKLLosseswith PGD-optimized maskingConfiguration:
Add to experiment configs with standard PGD parameters:
Related Issue
N/A - New feature implementation
Motivation and Context
During SPD training, we track various metrics to understand how different masking strategies affect model performance. We already have:
CEandKLLossesThis PR fills the gap by adding a PGD-optimized CE difference metric, allowing us to measure how adversarially-chosen masks (optimized to maximize
CE loss) compare to the target model. This is useful for:
How Has This Been Tested?
make type- basedpyright)make format- ruff)PGDReconLossandCEandKLLossesTest Results:
eval/ce_kl/ce_difference_pgd_masked: nan(correctly skipped non-3D output)eval/ce_kl/ce_difference_pgd_masked: 20.881893157958984(working correctly!)Note on unit tests: Following the repository's testing philosophy for research code ("Integration tests often too much overhead for research
code. Interactive use catches issues at low cost"), no unit tests were added. This is consistent with other PGD metrics in the codebase which also
lack unit tests. The metric has been validated during actual experiment runs.
Does this PR introduce a breaking change?
No breaking changes. This is purely additive - a new metric that can be optionally configured in experiment YAML files. Existing experiments
continue to work without any modifications."