Update README.md
Note that Metal support is available (experimental)
zeux authored Apr 13, 2024
1 parent e02f9be commit 3731786
Showing 1 changed file with 1 addition and 1 deletion.
README.md: 1 addition & 1 deletion
@@ -101,5 +101,5 @@ RTX 4090 has a peak bandwidth of ~1008 GB/s, however it's unclear if a peak high
calm uses [🤗 Safetensors](https://huggingface.co/docs/safetensors/index) to store model files. Note that the models require conversion (see below), because calm stores model hyperparameters in .safetensors metadata and may expect a particular set of tensor names or weight order within tensors that is not always compatible with the source. Tokenizer data is stored as tensors inside the model file as well.

[^1]: CUDA runtime and compiler are used for GPU acceleration, but no CUDA or C libraries are used. Python conversion scripts use safetensors and torch, see `tools/requirements.txt`.
- [^2]: Linux is the main supported OS at the moment; calm also works on macOS (on CPU) but does not support Metal.
+ [^2]: Linux is the main supported OS at the moment; calm also works on macOS (on CPU) and has experimental Metal support.
[^3]: Based on testing a specific Gigabyte GeForce RTX 4090 where both individual kernels from this repository and cuBLAS peak at ~955 GB/s.

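The README excerpt above mentions that calm keeps model hyperparameters in .safetensors metadata and stores tokenizer data as tensors inside the model file. Below is a minimal sketch of inspecting a converted model file with the safetensors Python library; the file name `model.calm.safetensors` is a hypothetical placeholder, and the metadata keys and tensor names it prints depend on calm's conversion scripts rather than anything specified in this commit.

```python
# Minimal sketch: inspect a converted calm model file with the safetensors library.
# The file name below is a hypothetical placeholder; the metadata keys and tensor
# names it prints are whatever the conversion scripts wrote, not a fixed schema.
from safetensors import safe_open

with safe_open("model.calm.safetensors", framework="pt", device="cpu") as f:
    # Model hyperparameters live in the .safetensors header metadata (string -> string).
    metadata = f.metadata() or {}
    for key, value in sorted(metadata.items()):
        print(f"{key} = {value}")

    # Tensor names: weights plus tokenizer data stored as tensors in the same file.
    for name in f.keys():
        print(name, tuple(f.get_tensor(name).shape))
```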