This repository has been archived by the owner on Jun 24, 2024. It is now read-only.
This should be relatively straightforward - it reads in the original ggml model, runs the quantization functions over the data, and writes it out to disk.
The exciting possibility is for parallelisation 👀 - all you should have to do is scan through the file to determine the tensor boundaries, then build an iterator from it and feed it to rayon. It would be a huge improvement over the C++ version, and it would be practically free!
Is there currently a way to convert models to ggml format? I'm close to getting `quantize` into a working demo and was wondering if this should also be ported for the PR.
Hm, just do the simplest possible thing for now and we'll figure out a new CLI later. There are several changes landing in the CLI soon, so we should avoid doing anything complicated until that's entirely resolved.
Split this off from #21 as it's a separate issue.