[bug] Multiple test failures when testing all models #291

@jlamypoirier

🐞 Describe the Bug

When running tests for all models in #289, I get the following failures:

FAILED tests/test_checkpoint.py::test_convert_distributed_to_huggingface[llamba]@dependency_group_2 - AssertionError: Un-handled entries after conversion: {'weights': ['layers.1.self_attn.query.weight', 'layers.1.self_attn.key_value.weigh...
FAILED tests/test_checkpoint.py::test_convert_fast_llm_to_huggingface[llamba]@dependency_group_2 - AssertionError: Un-handled entries after conversion: {'weights': ['layers.1.self_attn.query.weight', 'layers.1.self_attn.key_value.weigh...
FAILED tests/test_gpt_generate_and_forward.py::test_small_generate[mistral-False-True-10-10-10]@dependency_group_17 - AssertionError: assert False
FAILED tests/test_gpt_generate_and_forward.py::test_export_for_generate[llamba]@dependency_group_19 - AssertionError: Un-handled entries after conversion: {'weights': ['layers.1.self_attn.query.weight', 'layers.1.self_attn.key_value.weigh...
FAILED tests/test_gpt_generate_and_forward.py::test_small_generate[llama_mtp-False-True-10-10-10]@dependency_group_16 - AssertionError: assert False
FAILED tests/test_gpt_generate_and_forward.py::test_small_generate[llama_mtp-True-True-10-10-10]@dependency_group_16 - AssertionError: assert False
FAILED tests/test_gpt_generate_and_forward.py::test_small_generate_from_model[llama_mtp]@dependency_group_16 - AssertionError: assert False
FAILED tests/test_gpt_generate_and_forward.py::test_small_forward_return_hidden_states[llama_mtp]@dependency_group_16 - assert (9 - 1) == 2
FAILED tests/test_gpt_generate_and_forward.py::test_small_generate[mixtral-True-True-10-10-10]@dependency_group_18 - AssertionError: assert False
FAILED tests/test_gpt_generate_and_forward.py::test_small_generate_from_model[mixtral]@dependency_group_18 - AssertionError: assert False
FAILED tests/test_mb_seq_first.py::test_model_dp2_sp2_df4[llamba]@dependency_group_44 - ValueError: Comparison failed (66 errors)
FAILED tests/test_seq_first.py::test_model_sp2_ce4[llamba]@dependency_group_23 - ValueError: Comparison failed (1 errors)

From this, we get the following issues:

  • Conversion looks broken for llamba
  • Comparison is flaky for Global gradient: layers.0.word_embeddings_weight (threshold issue? found in test_model_pp2s1_bf4[mixtral], test_model_bf4[llamba])
  • Generation tests are flaky (see [bug] Generate test occasionally fails #274)
  • test_small_forward_return_hidden_states[llama_mtp]: testing issue? (layer count mismatch)
  • test_model_dp2_sp2_df4[llamba]: distributed mismatch, 66 errors (is distributed SSM broken?)
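
To illustrate the kind of check that produces the "Un-handled entries after conversion" message, here is a minimal sketch. It is not Fast-LLM's converter: `convert_weights` and the `rename` mapping are hypothetical names, and the assumption is only that a converter consumes the weights it has a rule for and fails on whatever is left over.

```python
def convert_weights(source: dict[str, object], rename: dict[str, str]) -> dict[str, object]:
    """Rename entries of `source` using `rename` (hypothetical helper, for illustration only)."""
    remaining = dict(source)  # work on a copy so the caller's dict is untouched
    converted = {}
    for old_name, new_name in rename.items():
        if old_name in remaining:
            converted[new_name] = remaining.pop(old_name)
    # Any entry still in `remaining` had no conversion rule.
    assert not remaining, (
        f"Un-handled entries after conversion: {{'weights': {sorted(remaining)}}}"
    )
    return converted
```

Under this reading, a message listing `layers.1.self_attn.query.weight` would mean that no conversion rule consumed that weight for llamba; whether that matches the actual cause still needs to be confirmed.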

I disabled the distributed and conversion tests for llamba, as well as the generation tests, in #289. We'll want to fix them and re-enable them.

🔄 Steps to Reproduce

Run the tests with #289 (pytest tests/ -v -n 10).
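
For triage, the FAILED lines from a saved run can be grouped by model. The sketch below is an assumption-laden helper, not part of the repository: `group_failures_by_model` is a hypothetical name, and it assumes the model name is the first dash-separated test parameter, as in the failures listed above.

```python
import re
from collections import defaultdict

# Matches e.g. "FAILED tests/test_x.py::test_name[model-extra]@group - message".
FAILED_LINE = re.compile(r"^FAILED \S+::(?P<test>\w+)\[(?P<params>[^\]]+)\]")


def group_failures_by_model(report: str) -> dict[str, list[str]]:
    """Map each model name to the names of its failed tests."""
    groups = defaultdict(list)
    for line in report.splitlines():
        match = FAILED_LINE.match(line.strip())
        if match:
            # Assumption: the model is the first parameter, before any "-".
            model = match["params"].split("-")[0]
            groups[model].append(match["test"])
    return dict(groups)
```

Calling `group_failures_by_model(open("pytest.log").read())` on a saved log (the file name is illustrative) returns a dictionary such as `{"llamba": [...], "mistral": [...]}`, which makes it easier to see which models account for most failures.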

🎯 Expected Behavior

Tests pass
