APPLICATION.md: Added MoE to naming conventions
mofosyne committed Apr 5, 2024
1 parent 131432e commit 6918b30
Showing 1 changed file with 5 additions and 3 deletions.
APPLICATION.md
@@ -28,18 +28,19 @@ For example a model card for a llamafile should have this section that you can p

## Llamafile Naming Convention

-Llamafiles follow a naming convention of `<Model>-<Version>-<Parameters>-<Quantization>.llamafile`.
+Llamafiles follow a naming convention of `<Model>-<Version>-<ExpertsCount>x<Parameters>-<Quantization>.llamafile`.

The components are:
1. **Model**: A descriptive name for the model type or architecture.
2. **Version (Optional)**: Denotes the model version number, starting at `v1` if not specified, formatted as `v<Major>.<Minor>`.
   - Best practice is to include the model version number only if the model has multiple versions; assume an unversioned model is the first version, and/or check the model card.
-3. **Parameters**: Indicates the number of parameters and their scale, represented as `<count><scale-prefix>`:
+3. **ExpertsCount**: Indicates the number of experts in a Mixture of Experts (MoE) based model.
+4. **Parameters**: Indicates the number of parameters and their scale, represented as `<count><scale-prefix>`:
- `T`: Trillion parameters.
- `B`: Billion parameters.
- `M`: Million parameters.
- `K`: Thousand parameters.
-4. **Quantization**: This part specifies how the model parameters are quantized or compressed. The notation is influenced by the `./quantize --help` command in `llama.cpp`.
+5. **Quantization**: This part specifies how the model parameters are quantized or compressed. The notation is influenced by the `./quantize --help` command in `llama.cpp`.
- Uncompressed formats:
- `F16`: 16-bit floats per weight
- `F32`: 32-bit floats per weight
@@ -51,6 +52,7 @@ The components are:
- Even Number (0 or 2): `<model weights> = <scaling factor> * <quantised weight>`
- Odd Number (1 or 3): `<model weights> = <offset factor> + <scaling factor> * <quantised weight>`
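As an illustrative sketch of the even/odd scheme above (a simplification: in `llama.cpp` these formulas are applied per block of weights, not to each weight in isolation):

```python
def dequantize(q, scale, offset=None):
    """Reconstruct an approximate model weight from its quantized value.

    Even-numbered quant types (0 or 2) use only a scaling factor:
        weight = scale * q
    Odd-numbered quant types (1 or 3) also add an offset factor:
        weight = offset + scale * q
    """
    if offset is None:
        # even-numbered type: <model weights> = <scaling factor> * <quantised weight>
        return scale * q
    # odd-numbered type: <model weights> = <offset factor> + <scaling factor> * <quantised weight>
    return offset + scale * q
```

For example, `dequantize(3, 0.5)` gives `1.5` for an even-numbered type, while `dequantize(3, 0.5, 0.25)` gives `1.75` for an odd-numbered type.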


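The full convention can be checked mechanically. The sketch below uses a hypothetical regular expression and made-up example filenames (not taken from this document) to split a name into its components, treating `Version` and the `ExpertsCount` prefix as optional:

```python
import re

# Hypothetical pattern for <Model>-<Version>-<ExpertsCount>x<Parameters>-<Quantization>.llamafile
NAME_RE = re.compile(
    r"^(?P<model>.+?)"                    # descriptive model name (lazy match)
    r"(?:-(?P<version>v\d+(?:\.\d+)?))?"  # optional v<Major>.<Minor>
    r"-(?:(?P<experts>\d+)x)?"            # optional MoE experts count, e.g. "8x"
    r"(?P<params>\d+(?:\.\d+)?[TBMK])"    # parameter count with T/B/M/K scale prefix
    r"-(?P<quant>\w+)"                    # quantization label, e.g. "Q5_K_M"
    r"\.llamafile$"
)

def parse_llamafile_name(name):
    """Return the name's components as a dict, or None if it doesn't conform."""
    m = NAME_RE.match(name)
    return m.groupdict() if m else None
```

Under this sketch, a hypothetical `Mixtral-v0.1-8x7B-Q5_K_M.llamafile` would parse into model `Mixtral`, version `v0.1`, 8 experts, `7B` parameters, and quantization `Q5_K_M`, while a non-MoE name such as `TinyLlama-1.1B-Q4_K_M.llamafile` simply leaves the version and experts fields empty.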
## Installing A Llamafile And Making It Accessible To Other Local Applications

Llamafiles are designed to be standalone and portable, eliminating the need for a traditional installation. For optimal discovery and integration with local application scripts/programs, we recommend the following search paths:
