APPLICATION.md: Added MoE to naming conventions
mofosyne committed Apr 5, 2024
1 parent 131432e commit 6918b30
Showing 1 changed file with 5 additions and 3 deletions.
APPLICATION.md
@@ -28,18 +28,19 @@ For example a model card for a llamafile should have this section that you can p

## Llamafile Naming Convention

-Llamafiles follow a naming convention of `<Model>-<Version>-<Parameters>-<Quantization>.llamafile`.
+Llamafiles follow a naming convention of `<Model>-<Version>-<ExpertsCount>x<Parameters>-<Quantization>.llamafile`.

The components are:
1. **Model**: A descriptive name for the model type or architecture.
2. **Version (Optional)**: Denotes the model version number, starting at `v1` if not specified, formatted as `v<Major>.<Minor>`.
   - Best practice is to include the model version number only if the model has multiple versions; assume an unversioned model is the first version, and/or check the model card.
-3. **Parameters**: Indicates the number of parameters and their scale, represented as `<count><scale-prefix>`:
+3. **ExpertsCount**: Indicates the number of experts in a Mixture of Experts (MoE) based model.
+4. **Parameters**: Indicates the number of parameters and their scale, represented as `<count><scale-prefix>`:
- `T`: Trillion parameters.
- `B`: Billion parameters.
- `M`: Million parameters.
- `K`: Thousand parameters.
-4. **Quantization**: This part specifies how the model parameters are quantized or compressed. The notation is influenced by the `./quantize --help` command in `llama.cpp`.
+5. **Quantization**: This part specifies how the model parameters are quantized or compressed. The notation is influenced by the `./quantize --help` command in `llama.cpp`.
- Uncompressed formats:
- `F16`: 16-bit floats per weight
- `F32`: 32-bit floats per weight
@@ -51,6 +52,7 @@ The components are:
- Even Number (0 or 2): `<model weights> = <scaling factor> * <quantised weight>`
- Odd Number (1 or 3): `<model weights> = <offset factor> + <scaling factor> * <quantised weight>`
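As an illustrative sketch of the even/odd scheme above (a simplification: in `llama.cpp` these formulas are applied per block of weights, not to each weight in isolation):

```python
def dequantize(q, scale, offset=None):
    """Reconstruct an approximate model weight from its quantized value.

    Even-numbered quant types (0 or 2) use only a scaling factor:
        weight = scale * q
    Odd-numbered quant types (1 or 3) also add an offset factor:
        weight = offset + scale * q
    """
    if offset is None:
        # even-numbered type: <model weights> = <scaling factor> * <quantised weight>
        return scale * q
    # odd-numbered type: <model weights> = <offset factor> + <scaling factor> * <quantised weight>
    return offset + scale * q
```

For example, `dequantize(3, 0.5)` gives `1.5` for an even-numbered type, while `dequantize(3, 0.5, 0.25)` gives `1.75` for an odd-numbered type.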


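The full convention can be checked mechanically. The sketch below uses a hypothetical regular expression and made-up example filenames (not taken from this document) to split a name into its components, treating `Version` and the `ExpertsCount` prefix as optional:

```python
import re

# Hypothetical pattern for <Model>-<Version>-<ExpertsCount>x<Parameters>-<Quantization>.llamafile
NAME_RE = re.compile(
    r"^(?P<model>.+?)"                    # descriptive model name (lazy match)
    r"(?:-(?P<version>v\d+(?:\.\d+)?))?"  # optional v<Major>.<Minor>
    r"-(?:(?P<experts>\d+)x)?"            # optional MoE experts count, e.g. "8x"
    r"(?P<params>\d+(?:\.\d+)?[TBMK])"    # parameter count with T/B/M/K scale prefix
    r"-(?P<quant>\w+)"                    # quantization label, e.g. "Q5_K_M"
    r"\.llamafile$"
)

def parse_llamafile_name(name):
    """Return the name's components as a dict, or None if it doesn't conform."""
    m = NAME_RE.match(name)
    return m.groupdict() if m else None
```

Under this sketch, a hypothetical `Mixtral-v0.1-8x7B-Q5_K_M.llamafile` would parse into model `Mixtral`, version `v0.1`, 8 experts, `7B` parameters, and quantization `Q5_K_M`, while a non-MoE name such as `TinyLlama-1.1B-Q4_K_M.llamafile` simply leaves the version and experts fields empty.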
## Installing A Llamafile And Making It Accessible To Other Local Applications

Llamafiles are designed to be standalone and portable, eliminating the need for a traditional installation. For optimal discovery and integration with local application scripts/programs, we recommend the following search paths:
