
[bug] Conversion fails when using layers_per_step with some input formats #87


Description

@RaymondLi0

🐞 Describe the Bug

Conversion fails when using `layers_per_step` together with `input_format=fast_llm`.
Example job: 7ada4a96-4b5d-43de-a156-ebea5f359a33

```
Global counter mismatch for parameter "layers.8.norm_1.weight" and shard "weights": 0 != 2048
[...]
Global counter mismatch for parameter "layers.17.output_weights" and shard "weights": 0 != 268435456
```

🔄 Steps to Reproduce

Convert a model exported in the `fast_llm` format, using the `layers_per_step` argument:

```bash
fast-llm convert gpt \
    input.path=exp_dir/export/fast_llm/20000 \
    input.format=fast_llm \
    output.path=exp_dir/export/mixtral/20000 \
    output.format=mixtral \
    use_cpu=False \
    exist_ok=True \
    layers_per_step=8
```
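
For comparison, the same command without the `layers_per_step` argument is presumed to convert cleanly; this is an assumption based on the issue title, which ties the failure to `layers_per_step` combined with certain input formats (only the `fast_llm` case is demonstrated here):

```bash
# Presumed-working baseline (assumption: dropping layers_per_step avoids
# the counter mismatch; not verified in this report)
fast-llm convert gpt \
    input.path=exp_dir/export/fast_llm/20000 \
    input.format=fast_llm \
    output.path=exp_dir/export/mixtral/20000 \
    output.format=mixtral \
    use_cpu=False \
    exist_ok=True
```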

🎯 Expected Behavior

Conversion succeeds.
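
As a quick sanity check, and assuming the converter writes its result under `output.path` as given above, a successful run should leave a Mixtral-format checkpoint at:

```bash
# Hypothetical verification step: list the converted checkpoint directory
ls exp_dir/export/mixtral/20000
```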
