
[bug] Conversion fails when using layers_per_step with some input formats #87


Description

@RaymondLi0

🐞 Describe the Bug

Conversion fails when using `layers_per_step` together with `input_format=fast_llm`.
Example job: 7ada4a96-4b5d-43de-a156-ebea5f359a33

```
Global counter mismatch for parameter "layers.8.norm_1.weight" and shard "weights": 0 != 2048
[...]
Global counter mismatch for parameter "layers.17.output_weights" and shard "weights": 0 != 268435456
```

🔄 Steps to Reproduce

Convert a model exported in the `fast_llm` format, using the `layers_per_step` argument:

```bash
fast-llm convert gpt \
    input.path=exp_dir/export/fast_llm/20000 \
    input.format=fast_llm \
    output.path=exp_dir/export/mixtral/20000 \
    output.format=mixtral \
    use_cpu=False \
    exist_ok=True \
    layers_per_step=8
```
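
For comparison, the same command without the `layers_per_step` argument is presumed to convert cleanly; this is an assumption based on the issue title, which ties the failure to `layers_per_step` combined with certain input formats (only the `fast_llm` case is demonstrated here):

```bash
# Presumed-working baseline (assumption: dropping layers_per_step avoids
# the counter mismatch; not verified in this report)
fast-llm convert gpt \
    input.path=exp_dir/export/fast_llm/20000 \
    input.format=fast_llm \
    output.path=exp_dir/export/mixtral/20000 \
    output.format=mixtral \
    use_cpu=False \
    exist_ok=True
```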

🎯 Expected Behavior

Conversion succeeds.
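
As a quick sanity check, and assuming the converter writes its result under `output.path` as given above, a successful run should leave a Mixtral-format checkpoint at:

```bash
# Hypothetical verification step: list the converted checkpoint directory
ls exp_dir/export/mixtral/20000
```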
