Microsoft Windows [Version 10.0.19045.4894]
(c) Microsoft Corporation. All rights reserved.

E:\LocalAI>git clone https://github.com/go-skynet/LocalAI
Cloning into 'LocalAI'...
remote: Enumerating objects: 18224, done.
remote: Counting objects: 100% (2050/2050), done.
remote: Compressing objects: 100% (241/241), done.
remote: Total 18224 (delta 1894), reused 1882 (delta 1806), pack-reused 16174 (from 1)
Receiving objects: 100% (18224/18224), 8.62 MiB | 26.60 MiB/s, done.
Resolving deltas: 100% (12066/12066), done.
Updating files: 100% (900/900), done.

E:\LocalAI>cd LocalAI/examples/telegram-bot

E:\LocalAI\LocalAI\examples\telegram-bot>git clone https://github.com/mudler/chatgpt_telegram_bot
Cloning into 'chatgpt_telegram_bot'...
remote: Enumerating objects: 469, done.
remote: Counting objects: 100% (22/22), done.
remote: Compressing objects: 100% (16/16), done.
remote: Total 469 (delta 9), reused 14 (delta 6), pack-reused 447 (from 1)
Receiving objects: 100% (469/469), 1.04 MiB | 12.40 MiB/s, done.
Resolving deltas: 100% (261/261), done.

E:\LocalAI\LocalAI\examples\telegram-bot>copy docker-compose.yml chatgpt_telegram_bot\
Overwrite chatgpt_telegram_bot\docker-compose.yml? (Yes/No/All): Yes
        1 file(s) copied.

E:\LocalAI\LocalAI\examples\telegram-bot>move config\config.example.yml config\config.yml
The system cannot find the file specified.

E:\LocalAI\LocalAI\examples\telegram-bot>cd chatgpt_telegram_bot

E:\LocalAI\LocalAI\examples\telegram-bot\chatgpt_telegram_bot>

E:\LocalAI\LocalAI\examples\telegram-bot\chatgpt_telegram_bot>move config\config.example.yml config\config.yml
        1 file(s) moved.

E:\LocalAI\LocalAI\examples\telegram-bot\chatgpt_telegram_bot>move config\config.example.env config\config.env
        1 file(s) moved.

E:\LocalAI\LocalAI\examples\telegram-bot\chatgpt_telegram_bot>docker-compose --env-file config\config.env up --build
time="2024-09-28T00:38:23+02:00" level=warning msg="E:\\LocalAI\\LocalAI\\examples\\telegram-bot\\chatgpt_telegram_bot\\docker-compose.yml: the attribute `version` is obsolete, it will be ignored, please remove it to avoid potential confusion"
[+] Running 39/1
 ✔ api Pulled  3558.3s
[+] Building 1361.6s (14/14) FINISHED  docker:desktop-linux
 => [chatgpt_telegram_bot internal] load build definition from Dockerfile  1.6s
 => => transferring dockerfile: 559B  0.0s
 => [chatgpt_telegram_bot internal] load metadata for docker.io/library/python:3.11-slim  4.2s
 => [chatgpt_telegram_bot auth] library/python:pull token for registry-1.docker.io  0.0s
 => [chatgpt_telegram_bot internal] load .dockerignore  0.9s
 => => transferring context: 46B  0.0s
 => [chatgpt_telegram_bot 1/7] FROM docker.io/library/python:3.11-slim@sha256:585cf0799407efc267fe1cce318322ec26  17.0s
 => => resolve docker.io/library/python:3.11-slim@sha256:585cf0799407efc267fe1cce318322ec26e015ac1b3d77f2517d50bc  0.5s
 => => sha256:2aee89346c425641eb096f84c6895a49eef1f89ec9fafcad4e180e81505e0da5 16.00MB / 16.00MB  3.0s
 => => sha256:0538735e543faf28c166d043655e555cafb37c0d23ed50557673c76c76a395d8 248B / 248B  1347.4s
 => => sha256:4aeebc636537eab716f32d24a71cab7591b119e95b4294ed38a8981fe1722eb5 3.51MB / 3.51MB  2.8s
 => => sha256:302e3ee498053a7b5332ac79e8efebec16e900289fc1ecd1c754ce8fa047fcab 29.13MB / 29.13MB  2.6s
 => => extracting sha256:302e3ee498053a7b5332ac79e8efebec16e900289fc1ecd1c754ce8fa047fcab  1.9s
 => => extracting sha256:4aeebc636537eab716f32d24a71cab7591b119e95b4294ed38a8981fe1722eb5  0.8s
 => => extracting sha256:2aee89346c425641eb096f84c6895a49eef1f89ec9fafcad4e180e81505e0da5  2.4s
 => => extracting sha256:0538735e543faf28c166d043655e555cafb37c0d23ed50557673c76c76a395d8  0.6s
 => [chatgpt_telegram_bot internal] load build context  3.7s
 => => transferring context: 2.23MB  0.8s
 => [chatgpt_telegram_bot 2/7] RUN set -eux; apt-get update; DEBIAN_FRONTEND="noninteractive" apt-g  910.8s
 => [chatgpt_telegram_bot 3/7] RUN pip install -U pip && pip install -U wheel  24.5s
 => [chatgpt_telegram_bot 4/7] COPY ./requirements.txt /tmp/requirements.txt  3.3s
 => [chatgpt_telegram_bot 5/7] RUN pip install -r /tmp/requirements.txt && rm -r /tmp/requirements.txt  247.6s
 => [chatgpt_telegram_bot 6/7] COPY . /code  4.1s
 => [chatgpt_telegram_bot 7/7] WORKDIR /code  5.1s
 => [chatgpt_telegram_bot] exporting to image  136.2s
 => => exporting layers  93.7s
 => => exporting manifest sha256:278c6bff7ed4107e3c84c8c40a88f86600f9488e636f49e32618c9013f7eee29  1.3s
 => => exporting config sha256:ecc78022d24bf0ecbedcac3f0cdf2d2d6e87a70fdeb0d6b9aeb9117b81b8a721  0.4s
 => => exporting attestation manifest sha256:eeeaef9d294a781393abfd7d716254be4b22d764f8cd30d10813da679db4dcda  0.8s
 => => exporting manifest list sha256:effc550c5f5e1933db943a0fcdecfba45339500694d3911bd0fc3b03d6a79d97  0.4s
 => => naming to docker.io/library/chatgpt_telegram_bot-chatgpt_telegram_bot:latest  0.1s
 => => unpacking to docker.io/library/chatgpt_telegram_bot-chatgpt_telegram_bot:latest  38.9s
 => [chatgpt_telegram_bot] resolving provenance for metadata file  0.0s
[+] Running 3/3
 ✔ Network chatgpt_telegram_bot_default  Created  1.3s
 ✔ Container chatgpt_telegram_bot-api-1  Created  28.4s
 ✔ Container chatgpt_telegram_bot  Created  2.1s
Attaching to chatgpt_telegram_bot, api-1
api-1 | ===> LocalAI All-in-One (AIO) container starting...
api-1 | NVIDIA GPU detected via WSL2
api-1 | /aio/entrypoint.sh: line 26: nvidia-smi: command not found
api-1 | NVIDIA GPU detected via WSL2, but nvidia-smi is not installed. GPU acceleration will not be available.
api-1 | GPU acceleration is not enabled or supported. Defaulting to CPU.
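Two messages in the output above are worth acting on. The compose warning at the top goes away if the obsolete top-level `version:` key is deleted from docker-compose.yml. The GPU message means the entrypoint sees a WSL2 GPU but cannot find nvidia-smi inside the container, so it falls back to CPU. A minimal sketch of a compose-level change for GPU use, assuming the NVIDIA Container Toolkit is available to Docker Desktop/WSL2 and assuming a CUDA-enabled AIO tag (the tag name below is an assumption; check the LocalAI image tags for your version):

  services:
    api:
      image: localai/localai:latest-aio-gpu-nvidia-cuda-12   # assumed CUDA AIO tag, not taken from this log
      deploy:
        resources:
          reservations:
            devices:
              - driver: nvidia
                count: 1
                capabilities: [gpu]

With CPU only, the AIO image still works, as the rest of this log shows; the GPU block is optional.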
api-1 | ===> Starting LocalAI[cpu] with the following models: /aio/cpu/embeddings.yaml,/aio/cpu/rerank.yaml,/aio/cpu/text-to-speech.yaml,/aio/cpu/image-gen.yaml,/aio/cpu/text-to-text.yaml,/aio/cpu/speech-to-text.yaml,/aio/cpu/vision.yaml api-1 | @@@@@ api-1 | Skipping rebuild api-1 | @@@@@ api-1 | If you are experiencing issues with the pre-compiled builds, try setting REBUILD=true api-1 | If you are still experiencing issues with the build, try setting CMAKE_ARGS and disable the instructions set as needed: api-1 | CMAKE_ARGS="-DGGML_F16C=OFF -DGGML_AVX512=OFF -DGGML_AVX2=OFF -DGGML_FMA=OFF" api-1 | see the documentation at: https://localai.io/basics/build/index.html api-1 | Note: See also https://github.com/go-skynet/LocalAI/issues/288 api-1 | @@@@@ api-1 | CPU info: api-1 | model name : Intel(R) Core(TM) i5-9400F CPU @ 2.90GHz api-1 | flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti ssbd ibrs ibpb stibp fsgsbase bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves md_clear flush_l1d arch_capabilities api-1 | CPU: AVX found OK api-1 | CPU: AVX2 found OK api-1 | CPU: no AVX512 found api-1 | @@@@@ api-1 | 12:01AM INF env file found, loading environment variables from file envFile=.env api-1 | 12:01AM DBG Setting logging to debug api-1 | 12:01AM INF Starting LocalAI using 8 threads, with models path: /models api-1 | 12:01AM INF LocalAI version: v2.21.1 (33b2d38dd0198d78dbc26aa020acfb6ff4c4048c) api-1 | 12:01AM DBG CPU capabilities: [3dnowprefetch abm adx aes apic arch_capabilities arch_perfmon avx avx2 bmi1 bmi2 clflush clflushopt cmov constant_tsc cpuid cx16 cx8 de erms f16c flush_l1d fma fpu fsgsbase fxsr ht hypervisor ibpb ibrs invpcid invpcid_single lahf_lm lm mca mce md_clear mmx movbe msr mtrr nopl nx pae pat pcid pclmulqdq pdcm pdpe1gb pge pni popcnt pse pse36 pti rdrand rdseed rdtscp rep_good sep smap smep ss ssbd sse sse2 sse4_1 sse4_2 ssse3 stibp syscall tsc vme xgetbv1 xsave xsavec xsaveopt xsaves xtopology] api-1 | WARNING: failed to determine nodes: open /sys/devices/system/node: no such file or directory api-1 | WARNING: failed to read int from file: open /sys/class/drm/card0/device/numa_node: no such file or directory api-1 | WARNING: failed to determine nodes: open /sys/devices/system/node: no such file or directory api-1 | WARNING: error parsing the pci address "vgem" api-1 | 12:01AM DBG GPU count: 1 api-1 | 12:01AM DBG GPU: card #0 @vgem api-1 | 12:01AM DBG [startup] resolved local model: /aio/cpu/embeddings.yaml api-1 | 12:01AM DBG [startup] resolved local model: /aio/cpu/rerank.yaml api-1 | 12:01AM DBG [startup] resolved local model: /aio/cpu/text-to-speech.yaml api-1 | 12:01AM DBG [startup] resolved local model: /aio/cpu/image-gen.yaml api-1 | 12:01AM DBG [startup] resolved local model: /aio/cpu/text-to-text.yaml api-1 | 12:01AM DBG [startup] resolved local model: /aio/cpu/speech-to-text.yaml api-1 | 12:01AM DBG [startup] resolved local model: /aio/cpu/vision.yaml api-1 | 12:01AM WRN [startup] failed resolving model '/usr/bin/local-ai' api-1 | 12:01AM ERR error installing models error="failed resolving model '/usr/bin/local-ai'" api-1 | 12:01AM DBG guessDefaultsFromFile: not a GGUF file api-1 | 12:01AM DBG guessDefaultsFromFile: not a GGUF 
file api-1 | 12:01AM DBG guessDefaultsFromFile: not a GGUF file api-1 | 12:01AM DBG guessDefaultsFromFile: template already set name=gpt-4o api-1 | 12:01AM DBG guessDefaultsFromFile: not a GGUF file api-1 | 12:01AM DBG guessDefaultsFromFile: not a GGUF file api-1 | 12:01AM DBG guessDefaultsFromFile: template already set name=gpt-4 api-1 | 12:01AM INF Preloading models from /models api-1 | 12:01AM INF Downloading "https://huggingface.co/mudler/all-MiniLM-L6-v2/resolve/main/ggml-model-q4_0.bin" api-1 | 12:01AM DBG SHA missing for "/models/ggml-model-q4_0.bin". Skipping validation api-1 | 12:01AM INF File "/models/ggml-model-q4_0.bin" downloaded and verified api-1 | api-1 | Model name: text-embedding-ada-002 api-1 | api-1 | api-1 | api-1 | You can test this model with curl like this: api-1 | api-1 | curl http://localhost:8080/embeddings -X POST -H "Content-Type: api-1 | application/json" -d '{ "input": "Your text string goes here", "model": "text- api-1 | embedding-ada-002" }' api-1 | api-1 | api-1 | 12:01AM DBG Checking "ggml-whisper-base.bin" exists and matches SHA api-1 | 12:01AM INF Downloading "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.bin" api-1 | 12:01AM INF Downloading /models/ggml-whisper-base.bin.partial: 72.0 MiB/141.1 MiB (51.01%) ETA: 4.8019375s api-1 | 12:01AM INF File "/models/ggml-whisper-base.bin" downloaded and verified api-1 | api-1 | Model name: whisper-1 api-1 | api-1 | api-1 | api-1 | ## example audio file api-1 | api-1 | wget --quiet --show-progress -O gb1.ogg api-1 | https://upload.wikimedia.org/wikipedia/commons/1/1f/George_W_Bush_Columbia_FINAL.ogg api-1 | api-1 | ## Send the example audio file to the transcriptions endpoint api-1 | api-1 | curl http://localhost:8080/v1/audio/transcriptions -H "Content-Type: api-1 | multipart/form-data" -F file="@$PWD/gb1.ogg" -F model="whisper-1" api-1 | api-1 | api-1 | 12:01AM DBG Checking "stablediffusion_assets/AutoencoderKL-256-256-fp16-opt.param" exists and matches SHA api-1 | 12:01AM INF Downloading "https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/AutoencoderKL-256-256-fp16-opt.param" api-1 | 12:01AM INF Downloading: 16.7 KiB api-1 | 12:01AM INF File "/models/stablediffusion_assets/AutoencoderKL-256-256-fp16-opt.param" downloaded and verified api-1 | 12:01AM DBG Checking "stablediffusion_assets/AutoencoderKL-512-512-fp16-opt.param" exists and matches SHA api-1 | 12:01AM INF Downloading "https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/AutoencoderKL-512-512-fp16-opt.param" api-1 | 12:01AM INF File "/models/stablediffusion_assets/AutoencoderKL-512-512-fp16-opt.param" downloaded and verified api-1 | 12:01AM DBG Checking "stablediffusion_assets/AutoencoderKL-base-fp16.param" exists and matches SHA api-1 | 12:01AM INF Downloading "https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/AutoencoderKL-base-fp16.param" api-1 | 12:01AM INF File "/models/stablediffusion_assets/AutoencoderKL-base-fp16.param" downloaded and verified api-1 | 12:01AM DBG Checking "stablediffusion_assets/AutoencoderKL-encoder-512-512-fp16.bin" exists and matches SHA api-1 | 12:01AM INF Downloading "https://github.com/EdVince/Stable-Diffusion-NCNN/releases/download/naifu/AutoencoderKL-encoder-512-512-fp16.bin" api-1 | 12:01AM INF File "/models/stablediffusion_assets/AutoencoderKL-encoder-512-512-fp16.bin" downloaded and verified api-1 | 12:01AM DBG Checking "stablediffusion_assets/AutoencoderKL-fp16.bin" exists and matches SHA 
api-1 | 12:01AM INF Downloading "https://github.com/EdVince/Stable-Diffusion-NCNN/releases/download/naifu/AutoencoderKL-fp16.bin" api-1 | 12:01AM INF Downloading /models/stablediffusion_assets/AutoencoderKL-fp16.bin.partial: 32.0 KiB/94.5 MiB (23.08%) ETA: 1m3.325541626s api-1 | 12:01AM INF File "/models/stablediffusion_assets/AutoencoderKL-fp16.bin" downloaded and verified api-1 | 12:01AM DBG Checking "stablediffusion_assets/FrozenCLIPEmbedder-fp16.bin" exists and matches SHA api-1 | 12:01AM INF Downloading "https://github.com/EdVince/Stable-Diffusion-NCNN/releases/download/naifu/FrozenCLIPEmbedder-fp16.bin" api-1 | 12:01AM INF Downloading /models/stablediffusion_assets/FrozenCLIPEmbedder-fp16.bin.partial: 32.0 KiB/235.1 MiB (30.77%) ETA: 1m1.341006801s api-1 | 12:01AM INF Downloading /models/stablediffusion_assets/FrozenCLIPEmbedder-fp16.bin.partial: 230.0 MiB/235.1 MiB (38.30%) ETA: 51.987066831s api-1 | 12:01AM INF File "/models/stablediffusion_assets/FrozenCLIPEmbedder-fp16.bin" downloaded and verified api-1 | 12:01AM DBG Checking "stablediffusion_assets/FrozenCLIPEmbedder-fp16.param" exists and matches SHA api-1 | 12:01AM INF Downloading "https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/FrozenCLIPEmbedder-fp16.param" api-1 | 12:01AM INF Downloading: 32.0 KiB api-1 | 12:01AM INF File "/models/stablediffusion_assets/FrozenCLIPEmbedder-fp16.param" downloaded and verified api-1 | 12:01AM DBG Checking "stablediffusion_assets/log_sigmas.bin" exists and matches SHA api-1 | 12:01AM INF Downloading "https://github.com/EdVince/Stable-Diffusion-NCNN/raw/main/x86/linux/assets/log_sigmas.bin" api-1 | 12:01AM INF File "/models/stablediffusion_assets/log_sigmas.bin" downloaded and verified api-1 | 12:01AM DBG Checking "stablediffusion_assets/UNetModel-256-256-MHA-fp16-opt.param" exists and matches SHA api-1 | 12:01AM INF Downloading "https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/UNetModel-256-256-MHA-fp16-opt.param" api-1 | 12:01AM INF File "/models/stablediffusion_assets/UNetModel-256-256-MHA-fp16-opt.param" downloaded and verified api-1 | 12:01AM DBG Checking "stablediffusion_assets/UNetModel-512-512-MHA-fp16-opt.param" exists and matches SHA api-1 | 12:01AM INF Downloading "https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/UNetModel-512-512-MHA-fp16-opt.param" api-1 | 12:01AM INF File "/models/stablediffusion_assets/UNetModel-512-512-MHA-fp16-opt.param" downloaded and verified api-1 | 12:01AM DBG Checking "stablediffusion_assets/UNetModel-base-MHA-fp16.param" exists and matches SHA api-1 | 12:01AM INF Downloading "https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/UNetModel-base-MHA-fp16.param" api-1 | 12:01AM INF File "/models/stablediffusion_assets/UNetModel-base-MHA-fp16.param" downloaded and verified api-1 | 12:01AM DBG Checking "stablediffusion_assets/UNetModel-MHA-fp16.bin" exists and matches SHA api-1 | 12:01AM INF Downloading "https://github.com/EdVince/Stable-Diffusion-NCNN/releases/download/naifu/UNetModel-MHA-fp16.bin" api-1 | 12:01AM INF Downloading /models/stablediffusion_assets/UNetModel-MHA-fp16.bin.partial: 107.5 MiB/1.6 GiB (77.43%) ETA: 12.937809323s api-1 | 12:02AM INF Downloading /models/stablediffusion_assets/UNetModel-MHA-fp16.bin.partial: 322.4 MiB/1.6 GiB (78.43%) ETA: 13.576186998s api-1 | 12:02AM INF Downloading /models/stablediffusion_assets/UNetModel-MHA-fp16.bin.partial: 540.5 MiB/1.6 GiB (79.46%) ETA: 14.058394893s api-1 
| 12:02AM INF Downloading /models/stablediffusion_assets/UNetModel-MHA-fp16.bin.partial: 772.5 MiB/1.6 GiB (80.55%) ETA: 14.34220611s api-1 | 12:02AM INF Downloading /models/stablediffusion_assets/UNetModel-MHA-fp16.bin.partial: 993.5 MiB/1.6 GiB (81.58%) ETA: 14.534406325s api-1 | 12:02AM INF Downloading /models/stablediffusion_assets/UNetModel-MHA-fp16.bin.partial: 1.0 GiB/1.6 GiB (81.78%) ETA: 15.462649692s api-1 | 12:02AM INF Downloading /models/stablediffusion_assets/UNetModel-MHA-fp16.bin.partial: 1.0 GiB/1.6 GiB (81.94%) ETA: 16.392840317s api-1 | 12:02AM INF Downloading /models/stablediffusion_assets/UNetModel-MHA-fp16.bin.partial: 1.3 GiB/1.6 GiB (83.03%) ETA: 16.21903822s api-1 | 12:02AM INF Downloading /models/stablediffusion_assets/UNetModel-MHA-fp16.bin.partial: 1.5 GiB/1.6 GiB (84.10%) ETA: 15.955268672s api-1 | 12:02AM INF File "/models/stablediffusion_assets/UNetModel-MHA-fp16.bin" downloaded and verified api-1 | 12:02AM DBG Checking "stablediffusion_assets/vocab.txt" exists and matches SHA api-1 | 12:02AM INF Downloading "https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/vocab.txt" api-1 | 12:02AM INF Downloading: 32.0 KiB api-1 | 12:02AM INF File "/models/stablediffusion_assets/vocab.txt" downloaded and verified api-1 | api-1 | Model name: stablediffusion api-1 | api-1 | api-1 | api-1 | Stable Diffusion in NCNN with c++, supported txt2img and img2img api-1 | api-1 | api-1 | api-1 | curl http://localhost:8080/v1/images/generations -H "Content-Type: api-1 | application/json" -d '{ "prompt": "|", "step": 25, "size": "512x512" }' api-1 | api-1 | api-1 | 12:02AM DBG Checking "bakllava.gguf" exists and matches SHA api-1 | 12:02AM INF Downloading "https://huggingface.co/mys/ggml_bakllava-1/resolve/main/ggml-model-q4_k.gguf" api-1 | 12:02AM INF Downloading /models/bakllava.gguf.partial: 242.4 MiB/4.1 GiB (2.91%) ETA: 57m21.096697273s api-1 | 12:03AM INF Downloading /models/bakllava.gguf.partial: 500.8 MiB/4.1 GiB (6.01%) ETA: 28m10.448412133s api-1 | 12:03AM INF Downloading /models/bakllava.gguf.partial: 762.0 MiB/4.1 GiB (9.15%) ETA: 18m43.780646895s api-1 | 12:03AM INF Downloading /models/bakllava.gguf.partial: 1.0 GiB/4.1 GiB (12.33%) ETA: 13m59.783664682s api-1 | 12:03AM INF Downloading /models/bakllava.gguf.partial: 1.3 GiB/4.1 GiB (15.37%) ETA: 11m17.836629293s api-1 | 12:03AM INF Downloading /models/bakllava.gguf.partial: 1.4 GiB/4.1 GiB (17.60%) ETA: 9m59.86376143s api-1 | 12:03AM INF Downloading /models/bakllava.gguf.partial: 1.6 GiB/4.1 GiB (20.12%) ETA: 8m48.641002754s api-1 | 12:03AM INF Downloading /models/bakllava.gguf.partial: 1.8 GiB/4.1 GiB (22.26%) ETA: 8m2.412685837s api-1 | 12:03AM INF Downloading /models/bakllava.gguf.partial: 2.0 GiB/4.1 GiB (24.50%) ETA: 7m21.070905638s api-1 | 12:03AM INF Downloading /models/bakllava.gguf.partial: 2.2 GiB/4.1 GiB (26.81%) ETA: 6m44.316424446s api-1 | 12:03AM INF Downloading /models/bakllava.gguf.partial: 2.4 GiB/4.1 GiB (29.16%) ETA: 6m11.9704301s api-1 | 12:03AM INF Downloading /models/bakllava.gguf.partial: 2.5 GiB/4.1 GiB (30.94%) ETA: 5m52.946535688s api-1 | 12:03AM INF Downloading /models/bakllava.gguf.partial: 2.7 GiB/4.1 GiB (32.94%) ETA: 5m32.126862067s api-1 | 12:04AM INF Downloading /models/bakllava.gguf.partial: 2.9 GiB/4.1 GiB (35.08%) ETA: 5m11.095035528s api-1 | 12:04AM INF Downloading /models/bakllava.gguf.partial: 3.1 GiB/4.1 GiB (37.53%) ETA: 4m48.137381094s api-1 | 12:04AM INF Downloading /models/bakllava.gguf.partial: 3.2 GiB/4.1 GiB (39.81%) ETA: 4m29.336579432s api-1 
| 12:04AM INF Downloading /models/bakllava.gguf.partial: 3.4 GiB/4.1 GiB (42.05%) ETA: 4m12.450274388s api-1 | 12:04AM INF Downloading /models/bakllava.gguf.partial: 3.6 GiB/4.1 GiB (44.36%) ETA: 3m56.002729408s api-1 | 12:04AM INF Downloading /models/bakllava.gguf.partial: 3.8 GiB/4.1 GiB (46.34%) ETA: 3m43.749609067s api-1 | 12:04AM INF Downloading /models/bakllava.gguf.partial: 4.0 GiB/4.1 GiB (48.68%) ETA: 3m28.946439968s api-1 | 12:04AM DBG SHA missing for "/models/bakllava.gguf". Skipping validation api-1 | 12:04AM INF File "/models/bakllava.gguf" downloaded and verified api-1 | 12:04AM DBG Checking "bakllava-mmproj.gguf" exists and matches SHA api-1 | 12:04AM INF Downloading "https://huggingface.co/mys/ggml_bakllava-1/resolve/main/mmproj-model-f16.gguf" api-1 | 12:04AM INF Downloading /models/bakllava-mmproj.gguf.partial: 32.0 KiB/595.5 MiB (0.00%) ETA: 2298h4m9.635323008s api-1 | 12:04AM INF Downloading /models/bakllava-mmproj.gguf.partial: 253.7 MiB/595.5 MiB (21.30%) ETA: 13m40.378335619s api-1 | 12:05AM INF Downloading /models/bakllava-mmproj.gguf.partial: 512.2 MiB/595.5 MiB (43.01%) ETA: 5m0.940500833s api-1 | 12:05AM DBG SHA missing for "/models/bakllava-mmproj.gguf". Skipping validation api-1 | 12:05AM INF File "/models/bakllava-mmproj.gguf" downloaded and verified api-1 | api-1 | Model name: gpt-4o api-1 | api-1 | api-1 | api-1 | curl http://localhost:8080/v1/chat/completions -H "Content-Type: api-1 | application/json" -d '{ "model": "gpt-4-vision-preview", "messages": [{"role": api-1 | "user", "content": [{"type":"text", "text": "What is in the image?"}, api-1 | {"type": "image_url", "image_url": {"url": api-1 | "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin- api-1 | madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature- api-1 | boardwalk.jpg" }}], "temperature": 0.9}]}' api-1 | api-1 | api-1 | 12:05AM DBG Checking "voice-en-us-amy-low.tar.gz" exists and matches SHA api-1 | 12:05AM INF Downloading "https://github.com/rhasspy/piper/releases/download/v0.0.2/voice-en-us-amy-low.tar.gz" api-1 | 12:05AM INF Downloading /models/voice-en-us-amy-low.tar.gz.partial: 32.0 KiB/55.6 MiB (0.06%) ETA: 118h10m18.53486481s api-1 | 12:05AM DBG SHA missing for "/models/voice-en-us-amy-low.tar.gz". 
Skipping validation api-1 | 12:05AM INF File "/models/voice-en-us-amy-low.tar.gz" downloaded and verified api-1 | 12:05AM INF File "/models/voice-en-us-amy-low.tar.gz" is an archive, uncompressing to /models api-1 | 12:05AM DBG Failed decompressing "/models/voice-en-us-amy-low.tar.gz": opening archive file: open /models/voice-en-us-amy-low.tar.gz: no such file or directory api-1 | 12:05AM ERR error downloading models error="opening archive file: open /models/voice-en-us-amy-low.tar.gz: no such file or directory" api-1 | 12:05AM DBG Checking "ggml-gpt4all-j.bin" exists and matches SHA api-1 | 12:05AM INF Downloading "https://gpt4all.io/models/ggml-gpt4all-j.bin" api-1 | 12:05AM INF Downloading /models/ggml-gpt4all-j.bin.partial: 138.7 MiB/3.5 GiB (3.84%) ETA: 2m5.170000328s api-1 | 12:05AM INF Downloading /models/ggml-gpt4all-j.bin.partial: 393.6 MiB/3.5 GiB (10.90%) ETA: 1m21.730759174s api-1 | 12:05AM INF Downloading /models/ggml-gpt4all-j.bin.partial: 645.2 MiB/3.5 GiB (17.87%) ETA: 1m8.935000971s api-1 | 12:05AM INF Downloading /models/ggml-gpt4all-j.bin.partial: 893.4 MiB/3.5 GiB (24.75%) ETA: 1m0.815022908s api-1 | 12:05AM INF Downloading /models/ggml-gpt4all-j.bin.partial: 1.1 GiB/3.5 GiB (30.07%) ETA: 58.140896236s api-1 | 12:05AM INF Downloading /models/ggml-gpt4all-j.bin.partial: 1.3 GiB/3.5 GiB (36.86%) ETA: 51.384330866s api-1 | 12:05AM INF Downloading /models/ggml-gpt4all-j.bin.partial: 1.5 GiB/3.5 GiB (43.87%) ETA: 44.788942588s api-1 | 12:05AM INF Downloading /models/ggml-gpt4all-j.bin.partial: 1.8 GiB/3.5 GiB (50.26%) ETA: 39.595847346s api-1 | 12:06AM INF Downloading /models/ggml-gpt4all-j.bin.partial: 2.0 GiB/3.5 GiB (55.58%) ETA: 35.970332788s api-1 | 12:06AM INF Downloading /models/ggml-gpt4all-j.bin.partial: 2.1 GiB/3.5 GiB (60.20%) ETA: 33.181643662s api-1 | 12:06AM INF Downloading /models/ggml-gpt4all-j.bin.partial: 2.3 GiB/3.5 GiB (65.79%) ETA: 28.776701433s api-1 | 12:06AM INF Downloading /models/ggml-gpt4all-j.bin.partial: 2.5 GiB/3.5 GiB (72.06%) ETA: 23.401598329s api-1 | 12:06AM INF Downloading /models/ggml-gpt4all-j.bin.partial: 2.7 GiB/3.5 GiB (77.50%) ETA: 18.966742975s api-1 | 12:06AM INF Downloading /models/ggml-gpt4all-j.bin.partial: 2.9 GiB/3.5 GiB (82.80%) ETA: 14.61099444s api-1 | 12:06AM INF Downloading /models/ggml-gpt4all-j.bin.partial: 3.1 GiB/3.5 GiB (88.43%) ETA: 9.854319324s api-1 | 12:06AM INF Downloading /models/ggml-gpt4all-j.bin.partial: 3.3 GiB/3.5 GiB (93.90%) ETA: 5.219785438s api-1 | 12:06AM INF Downloading /models/ggml-gpt4all-j.bin.partial: 3.5 GiB/3.5 GiB (99.79%) ETA: 182.340369ms api-1 | 12:06AM INF File "/models/ggml-gpt4all-j.bin" downloaded and verified api-1 | 12:06AM DBG Prompt template "gpt4all-completion" written api-1 | 12:06AM DBG Prompt template "gpt4all-chat" written api-1 | 12:06AM DBG Written config file /models/gpt-3.5-turbo.yaml api-1 | 12:06AM DBG Written gallery file /models/._gallery_gpt-3.5-turbo.yaml api-1 | 12:06AM DBG Checking "stablediffusion_assets/AutoencoderKL-256-256-fp16-opt.param" exists and matches SHA api-1 | 12:06AM DBG File "/models/stablediffusion_assets/AutoencoderKL-256-256-fp16-opt.param" already exists and matches the SHA. Skipping download api-1 | 12:06AM DBG Checking "stablediffusion_assets/AutoencoderKL-512-512-fp16-opt.param" exists and matches SHA api-1 | 12:06AM DBG File "/models/stablediffusion_assets/AutoencoderKL-512-512-fp16-opt.param" already exists and matches the SHA. 
Skipping download api-1 | 12:06AM DBG Checking "stablediffusion_assets/AutoencoderKL-base-fp16.param" exists and matches SHA api-1 | 12:06AM DBG File "/models/stablediffusion_assets/AutoencoderKL-base-fp16.param" already exists and matches the SHA. Skipping download api-1 | 12:06AM DBG Checking "stablediffusion_assets/AutoencoderKL-encoder-512-512-fp16.bin" exists and matches SHA api-1 | 12:07AM DBG File "/models/stablediffusion_assets/AutoencoderKL-encoder-512-512-fp16.bin" already exists and matches the SHA. Skipping download api-1 | 12:07AM DBG Checking "stablediffusion_assets/AutoencoderKL-fp16.bin" exists and matches SHA api-1 | 12:07AM DBG File "/models/stablediffusion_assets/AutoencoderKL-fp16.bin" already exists and matches the SHA. Skipping download api-1 | 12:07AM DBG Checking "stablediffusion_assets/FrozenCLIPEmbedder-fp16.bin" exists and matches SHA api-1 | 12:07AM DBG File "/models/stablediffusion_assets/FrozenCLIPEmbedder-fp16.bin" already exists and matches the SHA. Skipping download api-1 | 12:07AM DBG Checking "stablediffusion_assets/FrozenCLIPEmbedder-fp16.param" exists and matches SHA api-1 | 12:07AM DBG File "/models/stablediffusion_assets/FrozenCLIPEmbedder-fp16.param" already exists and matches the SHA. Skipping download api-1 | 12:07AM DBG Checking "stablediffusion_assets/log_sigmas.bin" exists and matches SHA api-1 | 12:07AM DBG File "/models/stablediffusion_assets/log_sigmas.bin" already exists and matches the SHA. Skipping download api-1 | 12:07AM DBG Checking "stablediffusion_assets/UNetModel-256-256-MHA-fp16-opt.param" exists and matches SHA api-1 | 12:07AM DBG File "/models/stablediffusion_assets/UNetModel-256-256-MHA-fp16-opt.param" already exists and matches the SHA. Skipping download api-1 | 12:07AM DBG Checking "stablediffusion_assets/UNetModel-512-512-MHA-fp16-opt.param" exists and matches SHA api-1 | 12:07AM DBG File "/models/stablediffusion_assets/UNetModel-512-512-MHA-fp16-opt.param" already exists and matches the SHA. Skipping download api-1 | 12:07AM DBG Checking "stablediffusion_assets/UNetModel-base-MHA-fp16.param" exists and matches SHA api-1 | 12:07AM DBG File "/models/stablediffusion_assets/UNetModel-base-MHA-fp16.param" already exists and matches the SHA. Skipping download api-1 | 12:07AM DBG Checking "stablediffusion_assets/UNetModel-MHA-fp16.bin" exists and matches SHA api-1 | 12:08AM DBG File "/models/stablediffusion_assets/UNetModel-MHA-fp16.bin" already exists and matches the SHA. Skipping download api-1 | 12:08AM DBG Checking "stablediffusion_assets/vocab.txt" exists and matches SHA api-1 | 12:08AM DBG File "/models/stablediffusion_assets/vocab.txt" already exists and matches the SHA. Skipping download api-1 | 12:08AM DBG Written config file /models/stablediffusion.yaml api-1 | 12:08AM DBG Written gallery file /models/._gallery_stablediffusion.yaml api-1 | 12:08AM DBG Checking "ggml-whisper-base.bin" exists and matches SHA api-1 | 12:08AM DBG File "/models/ggml-whisper-base.bin" already exists and matches the SHA. 
Skipping download api-1 | 12:08AM DBG Written config file /models/whisper-1.yaml api-1 | 12:08AM DBG Written gallery file /models/._gallery_whisper-1.yaml api-1 | 12:08AM DBG Model: gpt-4 (config: {PredictionOptions:{Model:huggingface://NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF/Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf Language: Translate:false N:0 TopP:0xc0009c3b58 TopK:0xc0009c3b60 Temperature:0xc0009c3b68 Maxtokens:0xc0009c3b98 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 RepeatLastN:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc0009c3b90 TypicalP:0xc0009c3b88 Seed:0xc0009c3bb0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:gpt-4 F16:0xc0009c3b50 Threads:0xc0009c3b48 Debug:0xc0009c3ba8 Roles:map[] Embeddings:0xc0009c3ba9 Backend: TemplateConfig:{Chat:{{.Input -}} api-1 | <|im_start|>assistant api-1 | ChatMessage:<|im_start|>{{if eq .RoleName "assistant"}}assistant{{else if eq .RoleName "system"}}system{{else if eq .RoleName "tool"}}tool{{else if eq .RoleName "user"}}user{{end}} api-1 | {{- if .FunctionCall }} api-1 | api-1 | {{- else if eq .RoleName "tool" }} api-1 | api-1 | {{- end }} api-1 | {{- if .Content}} api-1 | {{.Content }} api-1 | {{- end }} api-1 | {{- if .FunctionCall}} api-1 | {{toJson .FunctionCall}} api-1 | {{- end }} api-1 | {{- if .FunctionCall }} api-1 | api-1 | {{- else if eq .RoleName "tool" }} api-1 | api-1 | {{- end }}<|im_end|> api-1 | Completion:{{.Input}} api-1 | Edit: Functions:<|im_start|>system api-1 | You are a function calling AI model. api-1 | Here are the available tools: api-1 | api-1 | {{range .Functions}} api-1 | {'type': 'function', 'function': {'name': '{{.Name}}', 'description': '{{.Description}}', 'parameters': {{toJson .Parameters}} }} api-1 | {{end}} api-1 | api-1 | You should call the tools provided to you sequentially api-1 | Please use XML tags to record your reasoning and planning before you call the functions as follows: api-1 | api-1 | {step-by-step reasoning and plan in bullet points} api-1 | api-1 | For each function call return a json object with function name and arguments within XML tags as follows: api-1 | api-1 | {"arguments": , "name": } api-1 | <|im_end|> api-1 | {{.Input -}} api-1 | <|im_start|>assistant UseTokenizerTemplate:false JoinChatMessagesByCharacter:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: ResponseFormat: ResponseFormatMap:map[] FunctionsConfig:{DisableNoAction:true GrammarConfig:{ParallelCalls:false DisableParallelNewLines:false MixedMode:true NoMixedFreeString:false NoGrammar:false Prefix: ExpectStringsAfterJSON:false PropOrder: SchemaType:} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[(?s)(.*?) 
(?s)(.*?)] ReplaceFunctionResults:[{Key:(?s)^[^{\[]* Value:} {Key:(?s)[^}\]]*$ Value:} {Key:'([^']*?)' Value:_DQUOTE_${1}_DQUOTE_} {Key:\\" Value:__TEMP_QUOTE__} {Key:' Value:'} {Key:_DQUOTE_ Value:"} {Key:__TEMP_QUOTE__ Value:"} {Key:(?s).* Value:}] ReplaceLLMResult:[{Key:(?s).* Value:}] CaptureLLMResult:[] FunctionNameKey: FunctionArgumentsKey:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc0009c3b80 MirostatTAU:0xc0009c3b78 Mirostat:0xc0009c3b70 NGPULayers:0xc0009c3ba0 MMap:0xc0009c3a38 MMlock:0xc0009c3ba9 LowVRAM:0xc0009c3ba9 Grammar: StopWords:[<|im_end|> <|eot_id|> <|end_of_text|>] Cutstrings:[] ExtractRegex:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc0009c3a40 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: FlashAttention:false NoKVOffloading:false RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: VallE:{AudioPath:}} CUDA:false DownloadFiles:[] Description: Usage:}) api-1 | 12:08AM DBG Model: gpt-4o (config: {PredictionOptions:{Model:bakllava.gguf Language: Translate:false N:0 TopP:0xc0009c3348 TopK:0xc0009c3350 Temperature:0xc0009c3358 Maxtokens:0xc0009c3388 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 RepeatLastN:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc0009c3380 TypicalP:0xc0009c3378 Seed:0xc0009c33a0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:gpt-4o F16:0xc0009c32e0 Threads:0xc0009c3338 Debug:0xc0009c3398 Roles:map[assistant:ASSISTANT: system:SYSTEM: user:USER:] Embeddings:0xc0009c3399 Backend:llama-cpp TemplateConfig:{Chat:A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions. 
api-1 | {{.Input}} api-1 | ASSISTANT: api-1 | ChatMessage: Completion: Edit: Functions: UseTokenizerTemplate:false JoinChatMessagesByCharacter:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: ResponseFormat: ResponseFormatMap:map[] FunctionsConfig:{DisableNoAction:false GrammarConfig:{ParallelCalls:false DisableParallelNewLines:false MixedMode:false NoMixedFreeString:false NoGrammar:false Prefix: ExpectStringsAfterJSON:false PropOrder: SchemaType:} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[] ReplaceFunctionResults:[] ReplaceLLMResult:[] CaptureLLMResult:[] FunctionNameKey: FunctionArgumentsKey:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc0009c3370 MirostatTAU:0xc0009c3368 Mirostat:0xc0009c3360 NGPULayers:0xc0009c3390 MMap:0xc0009c32e1 MMlock:0xc0009c3399 LowVRAM:0xc0009c3399 Grammar: StopWords:[] Cutstrings:[] ExtractRegex:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc0009c32d0 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj:bakllava-mmproj.gguf FlashAttention:false NoKVOffloading:false RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: VallE:{AudioPath:}} CUDA:false DownloadFiles:[{Filename:bakllava.gguf SHA256: URI:huggingface://mys/ggml_bakllava-1/ggml-model-q4_k.gguf} {Filename:bakllava-mmproj.gguf SHA256: URI:huggingface://mys/ggml_bakllava-1/mmproj-model-f16.gguf}] Description: Usage:curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{ api-1 | "model": "gpt-4-vision-preview", api-1 | "messages": [{"role": "user", "content": [{"type":"text", "text": "What is in the image?"}, {"type": "image_url", "image_url": {"url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg" }}], "temperature": 0.9}]}' api-1 | }) api-1 | 12:08AM DBG Model: jina-reranker-v1-base-en (config: {PredictionOptions:{Model:cross-encoder Language: Translate:false N:0 TopP:0xc0009c36b8 TopK:0xc0009c36c0 Temperature:0xc0009c36c8 Maxtokens:0xc0009c36f8 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 RepeatLastN:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc0009c36f0 TypicalP:0xc0009c36e8 Seed:0xc0009c3710 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:jina-reranker-v1-base-en F16:0xc0009c36b0 Threads:0xc0009c36a8 Debug:0xc0009c3708 Roles:map[] Embeddings:0xc0009c3709 Backend:rerankers TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions: UseTokenizerTemplate:false JoinChatMessagesByCharacter:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: ResponseFormat: ResponseFormatMap:map[] FunctionsConfig:{DisableNoAction:false GrammarConfig:{ParallelCalls:false DisableParallelNewLines:false MixedMode:false NoMixedFreeString:false NoGrammar:false Prefix: ExpectStringsAfterJSON:false 
PropOrder: SchemaType:} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[] ReplaceFunctionResults:[] ReplaceLLMResult:[] CaptureLLMResult:[] FunctionNameKey: FunctionArgumentsKey:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc0009c36e0 MirostatTAU:0xc0009c36d8 Mirostat:0xc0009c36d0 NGPULayers:0xc0009c3700 MMap:0xc0009c3708 MMlock:0xc0009c3709 LowVRAM:0xc0009c3709 Grammar: StopWords:[] Cutstrings:[] ExtractRegex:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc0009c36a0 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: FlashAttention:false NoKVOffloading:false RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: VallE:{AudioPath:}} CUDA:false DownloadFiles:[] Description: Usage:You can test this model with curl like this: api-1 | api-1 | curl http://localhost:8080/v1/rerank \ api-1 | -H "Content-Type: application/json" \ api-1 | -d '{ api-1 | "model": "jina-reranker-v1-base-en", api-1 | "query": "Organic skincare products for sensitive skin", api-1 | "documents": [ api-1 | "Eco-friendly kitchenware for modern homes", api-1 | "Biodegradable cleaning supplies for eco-conscious consumers", api-1 | "Organic cotton baby clothes for sensitive skin", api-1 | "Natural organic skincare range for sensitive skin", api-1 | "Tech gadgets for smart homes: 2024 edition", api-1 | "Sustainable gardening tools and compost solutions", api-1 | "Sensitive skin-friendly facial cleansers and toners", api-1 | "Organic food wraps and storage solutions", api-1 | "All-natural pet food for dogs with allergies", api-1 | "Yoga mats made from recycled materials" api-1 | ], api-1 | "top_n": 3 api-1 | }' api-1 | }) api-1 | 12:08AM DBG Model: stablediffusion (config: {PredictionOptions:{Model:stablediffusion_assets Language: Translate:false N:0 TopP:0xc0009c3020 TopK:0xc0009c3028 Temperature:0xc0009c3030 Maxtokens:0xc0009c3060 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 RepeatLastN:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc0009c3058 TypicalP:0xc0009c3050 Seed:0xc0009c3078 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:stablediffusion F16:0xc0009c3018 Threads:0xc0009c3010 Debug:0xc0009c3070 Roles:map[] Embeddings:0xc0009c3071 Backend:stablediffusion TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions: UseTokenizerTemplate:false JoinChatMessagesByCharacter:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: ResponseFormat: ResponseFormatMap:map[] FunctionsConfig:{DisableNoAction:false GrammarConfig:{ParallelCalls:false DisableParallelNewLines:false MixedMode:false NoMixedFreeString:false NoGrammar:false Prefix: ExpectStringsAfterJSON:false PropOrder: SchemaType:} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[] ReplaceFunctionResults:[] ReplaceLLMResult:[] CaptureLLMResult:[] FunctionNameKey: FunctionArgumentsKey:} FeatureFlag:map[] 
LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc0009c3048 MirostatTAU:0xc0009c3040 Mirostat:0xc0009c3038 NGPULayers:0xc0009c3068 MMap:0xc0009c3070 MMlock:0xc0009c3071 LowVRAM:0xc0009c3071 Grammar: StopWords:[] Cutstrings:[] ExtractRegex:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc0009c3008 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: FlashAttention:false NoKVOffloading:false RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: VallE:{AudioPath:}} CUDA:false DownloadFiles:[{Filename:stablediffusion_assets/AutoencoderKL-256-256-fp16-opt.param SHA256:18ca4b66685e21406bcf64c484b3b680b4949900415536d599cc876579c85c82 URI:https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/AutoencoderKL-256-256-fp16-opt.param} {Filename:stablediffusion_assets/AutoencoderKL-512-512-fp16-opt.param SHA256:cf45f63aacf3dbbab0f59ed92a6f2c14d9a1801314631cd3abe91e3c85639a20 URI:https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/AutoencoderKL-512-512-fp16-opt.param} {Filename:stablediffusion_assets/AutoencoderKL-base-fp16.param SHA256:0254a056dce61b0c27dc9ec1b78b53bcf55315c540f55f051eb841aa992701ba URI:https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/AutoencoderKL-base-fp16.param} {Filename:stablediffusion_assets/AutoencoderKL-encoder-512-512-fp16.bin SHA256:ddcb79a9951b9f91e05e087739ed69da2c1c4ae30ba4168cce350b49d617c9fa URI:https://github.com/EdVince/Stable-Diffusion-NCNN/releases/download/naifu/AutoencoderKL-encoder-512-512-fp16.bin} {Filename:stablediffusion_assets/AutoencoderKL-fp16.bin SHA256:f02e71f80e70252734724bbfaed5c4ddd3a8ed7e61bb2175ff5f53099f0e35dd URI:https://github.com/EdVince/Stable-Diffusion-NCNN/releases/download/naifu/AutoencoderKL-fp16.bin} {Filename:stablediffusion_assets/FrozenCLIPEmbedder-fp16.bin SHA256:1c9a12f4e1dd1b295a388045f7f28a2352a4d70c3dc96a542189a3dd7051fdd6 URI:https://github.com/EdVince/Stable-Diffusion-NCNN/releases/download/naifu/FrozenCLIPEmbedder-fp16.bin} {Filename:stablediffusion_assets/FrozenCLIPEmbedder-fp16.param SHA256:471afbe678dd1fd3fe764ef9c6eccaccb0a7d7e601f27b462aa926b20eb368c9 URI:https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/FrozenCLIPEmbedder-fp16.param} {Filename:stablediffusion_assets/log_sigmas.bin SHA256:a2089f8aa4c61f9c200feaec541ab3f5c94233b28deb6d5e8bcd974fa79b68ac URI:https://github.com/EdVince/Stable-Diffusion-NCNN/raw/main/x86/linux/assets/log_sigmas.bin} {Filename:stablediffusion_assets/UNetModel-256-256-MHA-fp16-opt.param SHA256:a58c380229f09491776df837b7aa7adffc0a87821dc4708b34535da2e36e3da1 URI:https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/UNetModel-256-256-MHA-fp16-opt.param} {Filename:stablediffusion_assets/UNetModel-512-512-MHA-fp16-opt.param SHA256:f12034067062827bd7f43d1d21888d1f03905401acf6c6eea22be23c259636fa 
URI:https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/UNetModel-512-512-MHA-fp16-opt.param} {Filename:stablediffusion_assets/UNetModel-base-MHA-fp16.param SHA256:696f6975de49f4325b53ce32aff81861a6d6c07cd9ce3f0aae2cc405350af38d URI:https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/UNetModel-base-MHA-fp16.param} {Filename:stablediffusion_assets/UNetModel-MHA-fp16.bin SHA256:d618918d011bfc1f644c0f2a33bf84931bd53b28a98492b0a8ed6f3a818852c3 URI:https://github.com/EdVince/Stable-Diffusion-NCNN/releases/download/naifu/UNetModel-MHA-fp16.bin} {Filename:stablediffusion_assets/vocab.txt SHA256:e30e57b6f1e47616982ef898d8922be24e535b4fa3d0110477b3a6f02ebbae7d URI:https://raw.githubusercontent.com/EdVince/Stable-Diffusion-NCNN/main/x86/linux/assets/vocab.txt}] Description:Stable Diffusion in NCNN with c++, supported txt2img and img2img api-1 | Usage:curl http://localhost:8080/v1/images/generations \ api-1 | -H "Content-Type: application/json" \ api-1 | -d '{ api-1 | "prompt": "|", api-1 | "step": 25, api-1 | "size": "512x512" api-1 | }'}) api-1 | 12:08AM DBG Model: text-embedding-ada-002 (config: {PredictionOptions:{Model:ggml-model-q4_0.bin Language: Translate:false N:0 TopP:0xc0009c2a98 TopK:0xc0009c2aa0 Temperature:0xc0009c2aa8 Maxtokens:0xc0009c2ad8 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 RepeatLastN:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc0009c2ad0 TypicalP:0xc0009c2ac8 Seed:0xc0009c2af0 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:text-embedding-ada-002 F16:0xc0009c2a90 Threads:0xc0009c2a88 Debug:0xc0009c2ae8 Roles:map[] Embeddings:0xc0009c2ae9 Backend:bert-embeddings TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions: UseTokenizerTemplate:false JoinChatMessagesByCharacter:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: ResponseFormat: ResponseFormatMap:map[] FunctionsConfig:{DisableNoAction:false GrammarConfig:{ParallelCalls:false DisableParallelNewLines:false MixedMode:false NoMixedFreeString:false NoGrammar:false Prefix: ExpectStringsAfterJSON:false PropOrder: SchemaType:} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[] ReplaceFunctionResults:[] ReplaceLLMResult:[] CaptureLLMResult:[] FunctionNameKey: FunctionArgumentsKey:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc0009c2ac0 MirostatTAU:0xc0009c2ab8 Mirostat:0xc0009c2ab0 NGPULayers:0xc0009c2ae0 MMap:0xc0009c2ae8 MMlock:0xc0009c2ae9 LowVRAM:0xc0009c2ae9 Grammar: StopWords:[] Cutstrings:[] ExtractRegex:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc0009c2a80 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: FlashAttention:false NoKVOffloading:false RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: VallE:{AudioPath:}} CUDA:false DownloadFiles:[] Description: Usage:You can test this model with curl like 
this: api-1 | api-1 | curl http://localhost:8080/embeddings -X POST -H "Content-Type: application/json" -d '{ api-1 | "input": "Your text string goes here", api-1 | "model": "text-embedding-ada-002" api-1 | }'}) api-1 | 12:08AM DBG Model: tts-1 (config: {PredictionOptions:{Model:en-us-amy-low.onnx Language: Translate:false N:0 TopP:0xc0009c3538 TopK:0xc0009c3540 Temperature:0xc0009c3548 Maxtokens:0xc0009c3578 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 RepeatLastN:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc0009c3570 TypicalP:0xc0009c3568 Seed:0xc0009c3590 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:tts-1 F16:0xc0009c3530 Threads:0xc0009c3528 Debug:0xc0009c3588 Roles:map[] Embeddings:0xc0009c3589 Backend: TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions: UseTokenizerTemplate:false JoinChatMessagesByCharacter:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: ResponseFormat: ResponseFormatMap:map[] FunctionsConfig:{DisableNoAction:false GrammarConfig:{ParallelCalls:false DisableParallelNewLines:false MixedMode:false NoMixedFreeString:false NoGrammar:false Prefix: ExpectStringsAfterJSON:false PropOrder: SchemaType:} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[] ReplaceFunctionResults:[] ReplaceLLMResult:[] CaptureLLMResult:[] FunctionNameKey: FunctionArgumentsKey:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc0009c3560 MirostatTAU:0xc0009c3558 Mirostat:0xc0009c3550 NGPULayers:0xc0009c3580 MMap:0xc0009c3588 MMlock:0xc0009c3589 LowVRAM:0xc0009c3589 Grammar: StopWords:[] Cutstrings:[] ExtractRegex:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc0009c3520 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: FlashAttention:false NoKVOffloading:false RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: VallE:{AudioPath:}} CUDA:false DownloadFiles:[{Filename:voice-en-us-amy-low.tar.gz SHA256: URI:https://github.com/rhasspy/piper/releases/download/v0.0.2/voice-en-us-amy-low.tar.gz}] Description: Usage:To test if this model works as expected, you can use the following curl command: api-1 | api-1 | curl http://localhost:8080/tts -H "Content-Type: application/json" -d '{ api-1 | "model":"voice-en-us-amy-low", api-1 | "input": "Hi, this is a test." 
api-1 | }'}) api-1 | 12:08AM DBG Model: whisper-1 (config: {PredictionOptions:{Model:ggml-whisper-base.bin Language: Translate:false N:0 TopP:0xc0009c2ca8 TopK:0xc0009c2cb0 Temperature:0xc0009c2cb8 Maxtokens:0xc0009c2ce8 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 RepeatLastN:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc0009c2ce0 TypicalP:0xc0009c2cd8 Seed:0xc0009c2d00 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:whisper-1 F16:0xc0009c2ca0 Threads:0xc0009c2c98 Debug:0xc0009c2cf8 Roles:map[] Embeddings:0xc0009c2cf9 Backend:whisper TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions: UseTokenizerTemplate:false JoinChatMessagesByCharacter:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: ResponseFormat: ResponseFormatMap:map[] FunctionsConfig:{DisableNoAction:false GrammarConfig:{ParallelCalls:false DisableParallelNewLines:false MixedMode:false NoMixedFreeString:false NoGrammar:false Prefix: ExpectStringsAfterJSON:false PropOrder: SchemaType:} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[] ReplaceFunctionResults:[] ReplaceLLMResult:[] CaptureLLMResult:[] FunctionNameKey: FunctionArgumentsKey:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc0009c2cd0 MirostatTAU:0xc0009c2cc8 Mirostat:0xc0009c2cc0 NGPULayers:0xc0009c2cf0 MMap:0xc0009c2cf8 MMlock:0xc0009c2cf9 LowVRAM:0xc0009c2cf9 Grammar: StopWords:[] Cutstrings:[] ExtractRegex:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc0009c2c90 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: FlashAttention:false NoKVOffloading:false RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: VallE:{AudioPath:}} CUDA:false DownloadFiles:[{Filename:ggml-whisper-base.bin SHA256:60ed5bc3dd14eea856493d334349b405782ddcaf0028d4b5df4088345fba2efe URI:https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.bin}] Description: Usage:## example audio file api-1 | wget --quiet --show-progress -O gb1.ogg https://upload.wikimedia.org/wikipedia/commons/1/1f/George_W_Bush_Columbia_FINAL.ogg api-1 | api-1 | ## Send the example audio file to the transcriptions endpoint api-1 | curl http://localhost:8080/v1/audio/transcriptions \ api-1 | -H "Content-Type: multipart/form-data" \ api-1 | -F file="@$PWD/gb1.ogg" -F model="whisper-1" api-1 | }) api-1 | 12:09AM DBG Extracting backend assets files to /tmp/localai/backend_data api-1 | 12:09AM DBG processing api keys runtime update api-1 | 12:09AM DBG processing external_backends.json api-1 | 12:09AM DBG external backends loaded from external_backends.json api-1 | 12:09AM INF core/startup process completed! 
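Core startup has completed at this point and, as the next lines show, the API starts listening on port 8080 inside the container. A quick sanity check from the Windows host (assuming the example's docker-compose.yml maps port 8080, as the curl hints printed earlier in this log also assume):

  curl http://localhost:8080/readyz
  curl http://localhost:8080/v1/models

The /readyz "Success" entries that follow appear to be the same check being made from inside the container (ip=127.0.0.1).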
api-1 | 12:09AM DBG No configuration file found at /tmp/localai/upload/uploadedFiles.json api-1 | 12:09AM DBG No configuration file found at /tmp/localai/config/assistants.json api-1 | 12:09AM DBG No configuration file found at /tmp/localai/config/assistantsFile.json api-1 | 12:09AM INF LocalAI API is listening! Please connect to the endpoint for API documentation. endpoint=http://0.0.0.0:8080 api-1 | 12:09AM INF Success ip=127.0.0.1 latency=2.825303ms method=GET status=200 url=/readyz chatgpt_telegram_bot | /code/bot/agent.py:3: LangChainDeprecationWarning: Importing LocalAIEmbeddings from langchain.embeddings is deprecated. Please replace deprecated imports: chatgpt_telegram_bot | chatgpt_telegram_bot | >> from langchain.embeddings import LocalAIEmbeddings chatgpt_telegram_bot | chatgpt_telegram_bot | with new imports of: chatgpt_telegram_bot | chatgpt_telegram_bot | >> from langchain_community.embeddings import LocalAIEmbeddings chatgpt_telegram_bot | You can use the langchain cli to **automatically** upgrade many imports. Please see documentation here chatgpt_telegram_bot | from langchain.embeddings import LocalAIEmbeddings chatgpt_telegram_bot | USER_AGENT environment variable not set, consider setting it to identify your requests. chatgpt_telegram_bot | /code/bot/agent.py:19: LangChainDeprecationWarning: Importing SitemapLoader from langchain.document_loaders is deprecated. Please replace deprecated imports: chatgpt_telegram_bot | chatgpt_telegram_bot | >> from langchain.document_loaders import SitemapLoader chatgpt_telegram_bot | chatgpt_telegram_bot | with new imports of: chatgpt_telegram_bot | chatgpt_telegram_bot | >> from langchain_community.document_loaders import SitemapLoader chatgpt_telegram_bot | You can use the langchain cli to **automatically** upgrade many imports. Please see documentation here chatgpt_telegram_bot | from langchain.document_loaders import ( chatgpt_telegram_bot | /code/bot/agent.py:33: LangChainDeprecationWarning: Importing Chroma from langchain.vectorstores is deprecated. Please replace deprecated imports: chatgpt_telegram_bot | chatgpt_telegram_bot | >> from langchain.vectorstores import Chroma chatgpt_telegram_bot | chatgpt_telegram_bot | with new imports of: chatgpt_telegram_bot | chatgpt_telegram_bot | >> from langchain_community.vectorstores import Chroma chatgpt_telegram_bot | You can use the langchain cli to **automatically** upgrade many imports. Please see documentation here chatgpt_telegram_bot | from langchain.vectorstores import Chroma api-1 | 12:10AM INF Success ip=127.0.0.1 latency="67.006µs" method=GET status=200 url=/readyz api-1 | 12:11AM INF Success ip=127.0.0.1 latency="12.401µs" method=GET status=200 url=/readyz //I have deleted loop lines api-1 | 10:31AM INF Success ip=127.0.0.1 latency="13.4µs" method=GET status=200 url=/readyz api-1 | 10:32AM DBG Request received: {"model":"gpt-3.5-turbo","language":"","translate":false,"n":0,"top_p":1,"top_k":null,"temperature":0.7,"max_tokens":1000,"echo":false,"batch":0,"ignore_eos":false,"repeat_penalty":0,"repeat_last_n":0,"n_keep":0,"frequency_penalty":0,"presence_penalty":0,"tfz":null,"typical_p":null,"seed":null,"negative_prompt":"","rope_freq_base":0,"rope_freq_scale":0,"negative_prompt_scale":0,"use_fast_tokenizer":false,"clip_skip":0,"tokenizer":"","file":"","size":"","prompt":null,"instruction":"","input":null,"stop":null,"messages":[{"role":"system","content":"As an advanced chatbot Assistant, your primary goal is to assist users to the best of your ability. 
This may involve answering questions, providing helpful information, or completing tasks based on user input. In order to effectively assist users, it is important to be detailed and thorough in your responses. Use examples and evidence to support your points and justify your recommendations or solutions. Remember to always prioritize the needs and satisfaction of the user. Your ultimate goal is to provide a helpful and enjoyable experience for the user.\nIf user asks you about programming or asks to write code do not answer his question, but be sure to advise him to switch to a special mode \\\"👩🏼‍💻 Code Assistant\\\" by sending the command /mode to chat.\n"},{"role":"user","content":"Hi"}],"functions":null,"function_call":null,"stream":false,"mode":0,"step":0,"grammar":"","grammar_json_functions":null,"backend":"","model_base_name":""}
api-1 | 10:32AM DBG guessDefaultsFromFile: template already set name=gpt-3.5-turbo
chatgpt_telegram_bot | Something went wrong during completion. Reason: backend not found: /tmp/localai/backend_data/backend-assets/grpc/gpt4all-j {"error":{"code":500,"message":"backend not found: /tmp/localai/backend_data/backend-assets/grpc/gpt4all-j","type":""}} 500 {'error': {'code': 500, 'message': 'backend not found: /tmp/localai/backend_data/backend-assets/grpc/gpt4all-j', 'type': ''}}
api-1 | 10:32AM DBG Configuration read: &{PredictionOptions:{Model:ggml-gpt4all-j.bin Language: Translate:false N:0 TopP:0xc00184c530 TopK:0xc001444468 Temperature:0xc00184c520 Maxtokens:0xc00184c528 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 RepeatLastN:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc0014444e8 TypicalP:0xc0014444e0 Seed:0xc001444508 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:gpt-3.5-turbo F16:0xc0014444a8 Threads:0xc0014444a0 Debug:0xc00184cbc0 Roles:map[] Embeddings:0xc001444501 Backend:gpt4all-j TemplateConfig:{Chat:gpt4all-chat ChatMessage: Completion:gpt4all-completion Edit: Functions: UseTokenizerTemplate:false JoinChatMessagesByCharacter:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: ResponseFormat: ResponseFormatMap:map[] FunctionsConfig:{DisableNoAction:false GrammarConfig:{ParallelCalls:false DisableParallelNewLines:false MixedMode:false NoMixedFreeString:false NoGrammar:false Prefix: ExpectStringsAfterJSON:false PropOrder: SchemaType:} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[] ReplaceFunctionResults:[] ReplaceLLMResult:[] CaptureLLMResult:[] FunctionNameKey: FunctionArgumentsKey:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc0014444d8 MirostatTAU:0xc0014444d0 Mirostat:0xc0014444c8 NGPULayers:0xc0014444f8 MMap:0xc001444500 MMlock:0xc001444501 LowVRAM:0xc001444501 Grammar: StopWords:[] Cutstrings:[] ExtractRegex:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc001444428 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: FlashAttention:false NoKVOffloading:false RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: VallE:{AudioPath:}} CUDA:false DownloadFiles:[] Description: Usage:}
api-1 | 10:32AM DBG Parameters: &{PredictionOptions:{Model:ggml-gpt4all-j.bin Language: Translate:false N:0 TopP:0xc00184c530 TopK:0xc001444468 Temperature:0xc00184c520 Maxtokens:0xc00184c528 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 RepeatLastN:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc0014444e8 TypicalP:0xc0014444e0 Seed:0xc001444508 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:gpt-3.5-turbo F16:0xc0014444a8 Threads:0xc0014444a0 Debug:0xc00184cbc0 Roles:map[] Embeddings:0xc001444501 Backend:gpt4all-j TemplateConfig:{Chat:gpt4all-chat ChatMessage: Completion:gpt4all-completion Edit: Functions: UseTokenizerTemplate:false JoinChatMessagesByCharacter:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: ResponseFormat: ResponseFormatMap:map[] FunctionsConfig:{DisableNoAction:false GrammarConfig:{ParallelCalls:false DisableParallelNewLines:false MixedMode:false NoMixedFreeString:false NoGrammar:false Prefix: ExpectStringsAfterJSON:false PropOrder: SchemaType:} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[] ReplaceFunctionResults:[] ReplaceLLMResult:[] CaptureLLMResult:[] FunctionNameKey: FunctionArgumentsKey:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc0014444d8 MirostatTAU:0xc0014444d0 Mirostat:0xc0014444c8 NGPULayers:0xc0014444f8 MMap:0xc001444500 MMlock:0xc001444501 LowVRAM:0xc001444501 Grammar: StopWords:[] Cutstrings:[] ExtractRegex:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc001444428 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: FlashAttention:false NoKVOffloading:false RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: VallE:{AudioPath:}} CUDA:false DownloadFiles:[] Description: Usage:}
api-1 | 10:32AM DBG Prompt (before templating): As an advanced chatbot Assistant, your primary goal is to assist users to the best of your ability. This may involve answering questions, providing helpful information, or completing tasks based on user input. In order to effectively assist users, it is important to be detailed and thorough in your responses. Use examples and evidence to support your points and justify your recommendations or solutions. Remember to always prioritize the needs and satisfaction of the user. Your ultimate goal is to provide a helpful and enjoyable experience for the user.
api-1 | If user asks you about programming or asks to write code do not answer his question, but be sure to advise him to switch to a special mode \"👩🏼‍💻 Code Assistant\" by sending the command /mode to chat.
api-1 |
api-1 | Hi
api-1 | 10:32AM DBG Template found, input modified to: The prompt below is a question to answer, a task to complete, or a conversation to respond to; decide which and write an appropriate response.
api-1 | ### Prompt:
api-1 | As an advanced chatbot Assistant, your primary goal is to assist users to the best of your ability. This may involve answering questions, providing helpful information, or completing tasks based on user input. In order to effectively assist users, it is important to be detailed and thorough in your responses. Use examples and evidence to support your points and justify your recommendations or solutions. Remember to always prioritize the needs and satisfaction of the user. Your ultimate goal is to provide a helpful and enjoyable experience for the user.
api-1 | If user asks you about programming or asks to write code do not answer his question, but be sure to advise him to switch to a special mode \"👩🏼‍💻 Code Assistant\" by sending the command /mode to chat.
api-1 |
api-1 | Hi
api-1 | ### Response:
api-1 |
api-1 | 10:32AM DBG Prompt (after templating): The prompt below is a question to answer, a task to complete, or a conversation to respond to; decide which and write an appropriate response.
api-1 | ### Prompt:
api-1 | As an advanced chatbot Assistant, your primary goal is to assist users to the best of your ability. This may involve answering questions, providing helpful information, or completing tasks based on user input. In order to effectively assist users, it is important to be detailed and thorough in your responses. Use examples and evidence to support your points and justify your recommendations or solutions. Remember to always prioritize the needs and satisfaction of the user. Your ultimate goal is to provide a helpful and enjoyable experience for the user.
api-1 | If user asks you about programming or asks to write code do not answer his question, but be sure to advise him to switch to a special mode \"👩🏼‍💻 Code Assistant\" by sending the command /mode to chat.
api-1 |
api-1 | Hi
api-1 | ### Response:
api-1 |
api-1 | 10:32AM INF Loading model 'ggml-gpt4all-j.bin' with backend gpt4all-j
api-1 | 10:32AM DBG Loading model in memory from file: /models/ggml-gpt4all-j.bin
api-1 | 10:32AM DBG Loading Model ggml-gpt4all-j.bin with gRPC (file: /models/ggml-gpt4all-j.bin) (backend: gpt4all-j): {backendString:gpt4all-j model:ggml-gpt4all-j.bin threads:8 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc000210d88 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama2:/build/backend/python/exllama2/run.sh huggingface-embeddings:/build/backend/python/sentencetransformers/run.sh mamba:/build/backend/python/mamba/run.sh openvoice:/build/backend/python/openvoice/run.sh parler-tts:/build/backend/python/parler-tts/run.sh rerankers:/build/backend/python/rerankers/run.sh sentencetransformers:/build/backend/python/sentencetransformers/run.sh transformers:/build/backend/python/transformers/run.sh transformers-musicgen:/build/backend/python/transformers-musicgen/run.sh vall-e-x:/build/backend/python/vall-e-x/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
api-1 | 10:32AM ERR Server error error="backend not found: /tmp/localai/backend_data/backend-assets/grpc/gpt4all-j" ip=172.18.0.3 latency=94.639325ms method=POST status=500 url=/v1/chat/completions
api-1 | 10:32AM INF Success ip=127.0.0.1 latency="10.801µs" method=GET status=200 url=/readyz
api-1 | 10:33AM INF Success ip=127.0.0.1 latency="11.8µs" method=GET status=200 url=/readyz
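// Note on the failure above: the example's gpt-3.5-turbo definition points at ggml-gpt4all-j.bin
// with backend gpt4all-j, but the error shows no backend-assets/grpc/gpt4all-j binary exists in
// this image, so every /v1/chat/completions call from the bot returns 500 "backend not found".
// One possible workaround (a sketch, not taken from this transcript, and assuming the image still
// ships the llama.cpp backend under the name llama-cpp): edit the gpt-3.5-turbo YAML in the
// directory mounted at /models so the alias is served by llama-cpp with a GGUF model you have
// downloaded into the same directory; the file name below is a placeholder, and the chat/completion
// templates referenced by the old config may also need adjusting.

    name: gpt-3.5-turbo
    backend: llama-cpp
    context_size: 4096
    parameters:
      model: your-model.Q4_K_M.gguf   # placeholder: any chat-tuned GGUF placed in /models

// After restarting the api container, the endpoint can be checked directly (bypassing the bot)
// from the Windows prompt, using the same model name the bot sends:

    curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d "{\"model\":\"gpt-3.5-turbo\",\"messages\":[{\"role\":\"user\",\"content\":\"Hi\"}]}"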