Conversation

@jlamypoirier commented on Jun 5, 2025

✨ Description

  • Fixturize model tests (see the sketch after this list)
  • Run model tests for all models
  • Move the standardized model tests to their own directory (test_generate is a bit different, but it was moved there too)
  • Fix obvious failures
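
A rough sketch of the fixturized pattern. The fixture and config names below are purely illustrative, not the actual Fast-LLM fixtures; the point is only that a single parametrized fixture supplies the model config, so every standardized test is collected once per model:

```python
# Hypothetical sketch only: names are illustrative, not the real Fast-LLM fixtures.
import pytest

MODEL_CONFIGS = {"gpt2": {"vocab_size": 50257}, "llama": {"vocab_size": 32000}}

@pytest.fixture(params=sorted(MODEL_CONFIGS))
def model_testing_config(request):
    # pytest collects each test that uses this fixture once per model name,
    # e.g. test_model_forward[gpt2], test_model_forward[llama].
    return MODEL_CONFIGS[request.param]

def test_model_forward(model_testing_config):
    # Placeholder for a standardized check shared by all models.
    assert model_testing_config["vocab_size"] > 0
```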

🔍 Type of change

Select all that apply:

  • 🐛 Bug fix (non-breaking change that addresses a specific issue)
  • 🚀 New feature (non-breaking change that adds functionality)
  • ⚠️ Breaking change (a change that could affect existing functionality)
  • 📈 Performance improvement/optimization (improves speed, memory usage, or efficiency)
  • 🛠️ Code refactor (non-functional changes that improve code readability, structure, etc.)
  • 📦 Dependency bump (updates dependencies, including Dockerfile or package changes)
  • 📝 Documentation change (updates documentation, including new content or typo fixes)
  • 🔧 Infrastructure/Build change (affects build process, CI/CD, or dependencies)

@jlamypoirier changed the title from "Test all models" to "Run tests for all models" on Jun 5, 2025
@bigximik left a comment

@jlamypoirier I am getting errors when running pytest on some tests:

@pytest.mark.model_testing_group(ModelTestingGroup.convert)
    def test_run_converted_model(model_testing_config, convert_paths):
        model_ref = model_testing_config.huggingface_model_for_causal_lm_class.from_pretrained(
            CheckpointLoadConfig(
                path=convert_paths["checkpoint"],
                format=DistributedCheckpointFormat,
                load_config=ModelConfigType.model,
            )
        )
        test_input = torch.randint(
            0, model_ref.config.fast_llm_config.base_model.vocab_size, size=(4, 100), dtype=torch.int64, device="cuda"
        )
        output_ref = model_ref(test_input)
        model_from_fast_llm = model_testing_config.huggingface_model_for_causal_lm_class.from_pretrained(
            convert_paths["fast_llm_0"]
        )
        model_from_hf = model_testing_config.huggingface_model_for_causal_lm_class.from_pretrained(
            CheckpointLoadConfig(
                path=convert_paths["huggingface_0"],
                format=model_testing_config.checkpoint_format,
                load_config=ModelConfigType.model,
            )
        )
        errors = []
        compare = CompareConfig()
        model_as_hf = transformers.AutoModel.from_pretrained(
            convert_paths["huggingface_0"], trust_remote_code=model_testing_config.checkpoint_format.trust_remote_code
        ).cuda()
        for name, model in zip(
            ("From state dict", "From Huggingface", "Native Huggingface"),
            (model_from_fast_llm, model_from_hf, model_as_hf),
        ):
            print(name)
            output = model(test_input)
            # TODO: Make a generic comparison util.
            compare_logged_tensor(
                {"samples": output_ref.logits, "shape": output_ref.logits.shape, "step": 0},
>               {"samples": output.logits, "shape": output.logits.shape, "step": 0},
                            ^^^^^^^^^^^^^
                errors,
                name,
                "logits",
                compare,
            )
E           AttributeError: 'BaseModelOutputWithPast' object has no attribute 'logits'

tests/models/test_checkpoint.py:457: AttributeError

Other failed tests:

FAILED tests/models/test_match_megatron.py::test_megatron[gpt2]@dependency_group_28 - RuntimeError: Process failed with return code 1
FAILED tests/models/test_match_megatron.py::test_megatron[llama]@dependency_group_31 - RuntimeError: Process failed with return code 1
FAILED tests/test_functional.py::test_dropless_mlp - Failed: Test fails, aborting to avoid breaking cuda
FAILED tests/models/test_simple.py::test_model[llamba]@dependency_group_25 - ValueError: Comparison failed (1 errors)
FAILED tests/models/test_checkpoint.py::test_run_converted_model[starcoder2]@dependency_group_1 - AttributeError: 'BaseModelOutputWithPast' object has no attribute 'logits'
FAILED tests/models/test_checkpoint.py::test_run_converted_model[llama]@dependency_group_0 - AttributeError: 'BaseModelOutputWithPast' object has no attribute 'logits'
FAILED tests/models/test_checkpoint.py::test_run_converted_model[llama_mtp]@dependency_group_2 - AttributeError: 'BaseModelOutputWithPast' object has no attribute 'logits'
FAILED tests/models/test_checkpoint.py::test_run_converted_model[qwen2]@dependency_group_3 - AttributeError: 'BaseModelOutputWithPast' object has no attribute 'logits'
FAILED tests/models/test_checkpoint.py::test_run_converted_model[mistral]@dependency_group_4 - AttributeError: 'BaseModelOutputWithPast' object has no attribute 'logits'

The Megatron failures seem to be due to differences in my dev environment, but the others look legit?
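
For reference, the `logits` failures look consistent with `transformers.AutoModel` loading only the backbone (no LM head), whose forward pass returns a `BaseModelOutputWithPast` without `logits`, while `AutoModelForCausalLM` returns an output that does expose `logits`. A minimal sketch of the difference, assuming a standard Hugging Face causal-LM checkpoint (the path below is hypothetical, and this is not a claim about what the fix in the test suite should be):

```python
import torch
import transformers

checkpoint = "path/to/huggingface_0"  # hypothetical path, for illustration only

# AutoModel loads just the backbone: forward() returns BaseModelOutputWithPast,
# which exposes last_hidden_state but no logits.
backbone = transformers.AutoModel.from_pretrained(checkpoint)

# AutoModelForCausalLM adds the LM head: forward() returns CausalLMOutputWithPast,
# which does expose logits.
lm = transformers.AutoModelForCausalLM.from_pretrained(checkpoint)

tokens = torch.randint(0, lm.config.vocab_size, (1, 8))
print(hasattr(backbone(tokens), "logits"))  # False
print(lm(tokens).logits.shape)              # torch.Size([1, 8, vocab_size])
```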

@jlamypoirier

@bigximik These are known failures not directly related to this PR.

Base automatically changed from cleanup_tests to main on June 19, 2025 at 20:39
@jlamypoirier merged commit 1fcc369 into main on Jun 19, 2025
1 of 2 checks passed
@jlamypoirier deleted the test_all_models branch on June 19, 2025 at 20:40