Conversation

@jlamypoirier commented on Jun 5, 2025

✨ Description

  • Fixturize model tests (see the sketch after this list)
  • Run model tests for all models
  • Move the standardized model tests to their own directory (test_generate is a bit different, but it was moved there too)
  • Fix obvious failures
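
A rough sketch of the fixturized pattern. The fixture and config names below are purely illustrative, not the actual Fast-LLM fixtures; the point is only that a single parametrized fixture supplies the model config, so every standardized test is collected once per model:

```python
# Hypothetical sketch only: names are illustrative, not the real Fast-LLM fixtures.
import pytest

MODEL_CONFIGS = {"gpt2": {"vocab_size": 50257}, "llama": {"vocab_size": 32000}}

@pytest.fixture(params=sorted(MODEL_CONFIGS))
def model_testing_config(request):
    # pytest collects each test that uses this fixture once per model name,
    # e.g. test_model_forward[gpt2], test_model_forward[llama].
    return MODEL_CONFIGS[request.param]

def test_model_forward(model_testing_config):
    # Placeholder for a standardized check shared by all models.
    assert model_testing_config["vocab_size"] > 0
```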

🔍 Type of change

Select all that apply:

  • 🐛 Bug fix (non-breaking change that addresses a specific issue)
  • 🚀 New feature (non-breaking change that adds functionality)
  • ⚠️ Breaking change (a change that could affect existing functionality)
  • 📈 Performance improvement/optimization (improves speed, memory usage, or efficiency)
  • 🛠️ Code refactor (non-functional changes that improve code readability, structure, etc.)
  • 📦 Dependency bump (updates dependencies, including Dockerfile or package changes)
  • 📝 Documentation change (updates documentation, including new content or typo fixes)
  • 🔧 Infrastructure/Build change (affects build process, CI/CD, or dependencies)

@jlamypoirier changed the title from "Test all models" to "Run tests for all models" on Jun 5, 2025
@bigximik left a comment

@jlamypoirier I am getting errors when running pytest on some tests:

@pytest.mark.model_testing_group(ModelTestingGroup.convert)
    def test_run_converted_model(model_testing_config, convert_paths):
        model_ref = model_testing_config.huggingface_model_for_causal_lm_class.from_pretrained(
            CheckpointLoadConfig(
                path=convert_paths["checkpoint"],
                format=DistributedCheckpointFormat,
                load_config=ModelConfigType.model,
            )
        )
        test_input = torch.randint(
            0, model_ref.config.fast_llm_config.base_model.vocab_size, size=(4, 100), dtype=torch.int64, device="cuda"
        )
        output_ref = model_ref(test_input)
        model_from_fast_llm = model_testing_config.huggingface_model_for_causal_lm_class.from_pretrained(
            convert_paths["fast_llm_0"]
        )
        model_from_hf = model_testing_config.huggingface_model_for_causal_lm_class.from_pretrained(
            CheckpointLoadConfig(
                path=convert_paths["huggingface_0"],
                format=model_testing_config.checkpoint_format,
                load_config=ModelConfigType.model,
            )
        )
        errors = []
        compare = CompareConfig()
        model_as_hf = transformers.AutoModel.from_pretrained(
            convert_paths["huggingface_0"], trust_remote_code=model_testing_config.checkpoint_format.trust_remote_code
        ).cuda()
        for name, model in zip(
            ("From state dict", "From Huggingface", "Native Huggingface"),
            (model_from_fast_llm, model_from_hf, model_as_hf),
        ):
            print(name)
            output = model(test_input)
            # TODO: Make a generic comparison util.
            compare_logged_tensor(
                {"samples": output_ref.logits, "shape": output_ref.logits.shape, "step": 0},
>               {"samples": output.logits, "shape": output.logits.shape, "step": 0},
                            ^^^^^^^^^^^^^
                errors,
                name,
                "logits",
                compare,
            )
E           AttributeError: 'BaseModelOutputWithPast' object has no attribute 'logits'

tests/models/test_checkpoint.py:457: AttributeError

Other failed tests:

FAILED tests/models/test_match_megatron.py::test_megatron[gpt2]@dependency_group_28 - RuntimeError: Process failed with return code 1
FAILED tests/models/test_match_megatron.py::test_megatron[llama]@dependency_group_31 - RuntimeError: Process failed with return code 1
FAILED tests/test_functional.py::test_dropless_mlp - Failed: Test fails, aborting to avoid breaking cuda
FAILED tests/models/test_simple.py::test_model[llamba]@dependency_group_25 - ValueError: Comparison failed (1 errors)
FAILED tests/models/test_checkpoint.py::test_run_converted_model[starcoder2]@dependency_group_1 - AttributeError: 'BaseModelOutputWithPast' object has no attribute 'logits'
FAILED tests/models/test_checkpoint.py::test_run_converted_model[llama]@dependency_group_0 - AttributeError: 'BaseModelOutputWithPast' object has no attribute 'logits'
FAILED tests/models/test_checkpoint.py::test_run_converted_model[llama_mtp]@dependency_group_2 - AttributeError: 'BaseModelOutputWithPast' object has no attribute 'logits'
FAILED tests/models/test_checkpoint.py::test_run_converted_model[qwen2]@dependency_group_3 - AttributeError: 'BaseModelOutputWithPast' object has no attribute 'logits'
FAILED tests/models/test_checkpoint.py::test_run_converted_model[mistral]@dependency_group_4 - AttributeError: 'BaseModelOutputWithPast' object has no attribute 'logits'

The Megatron failures seem to be due to differences in my dev environment, but the others look legit?
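
For reference, the `logits` failures look consistent with `transformers.AutoModel` loading only the backbone (no LM head), whose forward pass returns a `BaseModelOutputWithPast` without `logits`, while `AutoModelForCausalLM` returns an output that does expose `logits`. A minimal sketch of the difference, assuming a standard Hugging Face causal-LM checkpoint (the path below is hypothetical, and this is not a claim about what the fix in the test suite should be):

```python
import torch
import transformers

checkpoint = "path/to/huggingface_0"  # hypothetical path, for illustration only

# AutoModel loads just the backbone: forward() returns BaseModelOutputWithPast,
# which exposes last_hidden_state but no logits.
backbone = transformers.AutoModel.from_pretrained(checkpoint)

# AutoModelForCausalLM adds the LM head: forward() returns CausalLMOutputWithPast,
# which does expose logits.
lm = transformers.AutoModelForCausalLM.from_pretrained(checkpoint)

tokens = torch.randint(0, lm.config.vocab_size, (1, 8))
print(hasattr(backbone(tokens), "logits"))  # False
print(lm(tokens).logits.shape)              # torch.Size([1, 8, vocab_size])
```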

@jlamypoirier

@bigximik These are known failures not directly related to this PR.

Base automatically changed from cleanup_tests to main on June 19, 2025 at 20:39
@jlamypoirier merged commit 1fcc369 into main on Jun 19, 2025
1 of 2 checks passed
@jlamypoirier deleted the test_all_models branch on June 19, 2025 at 20:40