This repository has been archived by the owner on Sep 30, 2023. It is now read-only.
Releases · ravenscroftj/turbopilot
v0.2.1
What's Changed
- Temporary fix for StableCode and StarCoder on Mac by @ravenscroftj in #67
Full Changelog: v0.2.0...v0.2.1
v0.2.0
What's Changed
- New: Turbopilot now supports full GPU offloading via CUDA and OpenCL by @ravenscroftj in #55
- New: Turbopilot now supports StableCode models by @ravenscroftj in #43
- New: Turbopilot is now interoperable with the Huggingface VSCode plugin by @ravenscroftj in #44
- New: End-user control of generation temperature and top-p via command line arguments by @ravenscroftj in #52
- New: Exposed the batch size flag via the CLI by @ravenscroftj in #59
- New: Added a debug log level and timings to model outputs by @ravenscroftj in #54
- Fix: incorrect instructions in the model docs by @aperullo in #57
- Fix: prevent the model from crashing under multiple parallel requests by using locking, by @ravenscroftj in #58
- Change: moved from Alpine to Ubuntu in Dockerfile.default by @nvtienanh in #60
- Fix: download link in MODELS.md by @c01o in #61
- Change: implemented better Docker builds by @ravenscroftj in #62
New Contributors
- @aperullo made their first contribution in #57
- @nvtienanh made their first contribution in #60
- @c01o made their first contribution in #61
Full Changelog: v0.1.0...v0.2.0
v0.1.0
What's Changed
- Refactored codebase: there is now a single unified turbopilot binary that provides support for codegen and starcoder style models.
- Support for starcoder, wizardcoder and santacoder models
- Support for CUDA 11 and 12
Full Changelog: v0.0.5...v0.1.0
Version 0.0.5
What's Changed
- Turbopilot now supports CUDA, which significantly accelerates inference for long prompts on machines with NVIDIA cards, by @ravenscroftj in #27
- Turbopilot now comes with prebuilt Windows binaries by @ravenscroftj in #29
- Added links to download pre-converted and pre-quantized CodeGen mono models by @virtualramblas and @CRD716 in #18 and #21
New Contributors
- @virtualramblas made their first contribution in #18
- @CRD716 made their first contribution in #21
- @ravenscroftj made their first contribution in #27
Full Changelog: v0.0.4...v0.0.5
Version 0.0.4
- Added multi-threaded server support, which should prevent health checks aimed at GET / from failing during prediction.
- Separated the autocomplete lambda into a separate C++ function so that it can be bound to /v1/completions, /v1/engines/copilot-codex/completions and /v1/engines/codegen/completions (see the request sketch after this list).
- Removed model from the completion input as a required param, which stops the official Copilot plugin from freaking out.
- Integrated the latest changes from upstream ggml, including some fixes for ARM NEON processors.
- Added Mac "universal binary" builds as part of CI.
Support for a fork of vscode-fauxpilot with a progress indicator is now available (the PR is open upstream, please react/vote for it).
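For context on the endpoints above, here is a minimal client sketch. It is an illustration under stated assumptions, not taken from these notes: the port (18080) and the OpenAI-style request/response fields (prompt, max_tokens, temperature, choices) are assumptions, so check the Turbopilot README for the exact values.

```python
# Minimal sketch of calling one of the completion endpoints listed above.
# Assumptions (not from these notes): port 18080 and an OpenAI-style
# JSON body/response; "model" is intentionally omitted per the change above.
import requests

resp = requests.post(
    "http://localhost:18080/v1/engines/codegen/completions",
    json={
        "prompt": "def fibonacci(n):",  # code context to complete
        "max_tokens": 64,               # cap on generated tokens
        "temperature": 0.2,             # sampling temperature
    },
    timeout=60,
)
resp.raise_for_status()
# assumes an OpenAI-style response with a "choices" list
print(resp.json()["choices"][0]["text"])
```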
V0.0.3
- Added 350M parameter codegen model to Google Drive folder
- Added multi-arch Docker images so that users can now run directly on Apple silicon and even Raspberry Pi 4
- Now supports pre-tokenized inputs passed into the API from a Python tokenizer (thanks to @thakkarparth007 for their PR - ravenscroftj/ggml#2); a rough sketch follows below
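The sketch below tokenizes a prompt with a Hugging Face tokenizer and posts the token IDs instead of raw text. The tokenizer checkpoint, port and especially the "tokens" request field are hypothetical placeholders; the linked PR is the authoritative reference for the actual interface.

```python
# Hypothetical sketch: tokenize client-side and send token IDs to the API.
# The "tokens" field name and port are placeholder assumptions, not
# confirmed by these notes; see ravenscroftj/ggml#2 for the real interface.
import requests
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen-350M-mono")
token_ids = tokenizer.encode("def hello_world():")  # list of ints

resp = requests.post(
    "http://localhost:18080/v1/engines/codegen/completions",
    json={"tokens": token_ids, "max_tokens": 32},
    timeout=60,
)
print(resp.json())
```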
0.0.2
What's Changed
- Project now builds on macOS (thanks to @Dimitrije-V for their PR ravenscroftj/ggml#1 and @dabdine for contributing clearer Mac build instructions)
- Fixed an inability to load vocab.json when converting the 16B model due to the file's encoding not being set, by @swanserquack in #5
- Improved model performance by incorporating changes to the GGML library from @ggerganov
New Contributors
- @swanserquack made their first contribution in #5
Full Changelog: 0.0.1...0.0.2
Version 0.0.1
- Initial release of Turbo Pilot