This repository was archived by the owner on Jul 4, 2025. It is now read-only.

feat: use llama cpp server #350

Closed
wants to merge 33 commits
Commits (33)
30042d8
feat: using llama.cpp server
sangjanai Dec 26, 2024
454ac86
feat: use llama.cpp server linux
sangjanai Dec 26, 2024
4c1a26f
fix: add patch
sangjanai Dec 26, 2024
b1fe9a9
fix: CI
sangjanai Dec 26, 2024
12971db
chore: e2e tests
sangjanai Dec 27, 2024
6859bff
fix: support stream_options
sangjanai Dec 27, 2024
b4db561
chore: verify openai api compatibility
sangjanai Dec 31, 2024
051e9d6
chore: cleanup
sangjanai Dec 31, 2024
3274bc7
Merge branch 'main' of https://github.com/janhq/cortex.llamacpp into …
sangjanai Dec 31, 2024
e6c4218
chore: enable build
sangjanai Dec 31, 2024
c4c16ed
Merge branch 'main' of github.com:janhq/cortex.llamacpp into feat/use…
sangjanai Jan 2, 2025
7aa8135
chore: update patch
sangjanai Jan 2, 2025
3b93f74
chore: e2e
sangjanai Jan 2, 2025
04d90b3
fix: build macos
sangjanai Jan 2, 2025
aefc495
chore: add docs
sangjanai Jan 2, 2025
1903170
fix: test with cortex.cpp
sangjanai Jan 2, 2025
ba7e5af
fix: pack llama-server
sangjanai Jan 13, 2025
3476fba
Merge branch 'main' into feat/use-llama-cpp-server
vansangpfiev Jan 13, 2025
31e308d
fix: patch
sangjanai Jan 13, 2025
e83daea
fix: cuda dll search path for child process
sangjanai Jan 15, 2025
e86fba5
Merge branch 'main' into feat/use-llama-cpp-server
vansangpfiev Jan 20, 2025
921f147
Merge branch 'main' of https://github.com/janhq/cortex.llamacpp into …
sangjanai Feb 3, 2025
9f3db4c
fix: patch
sangjanai Feb 3, 2025
dc5218c
fix: return models info
sangjanai Feb 5, 2025
4d97d98
fix: get ram, vram, size
sangjanai Feb 5, 2025
b4315c4
chore: update workflows
sangjanai Feb 5, 2025
045b589
test: pack server (#393)
vansangpfiev Feb 5, 2025
300ad5c
chore: update nightly-build
sangjanai Feb 5, 2025
7bc6499
chore: update nightly-build
sangjanai Feb 5, 2025
dc2ff64
fix: ignore empty parameter
sangjanai Feb 5, 2025
6ef1682
Merge branch 'feat/use-llama-cpp-server' of https://github.com/janhq/…
sangjanai Feb 5, 2025
565cb58
fix: terminate nix process
sangjanai Feb 5, 2025
4a0e548
fix: add permission
sangjanai Feb 5, 2025
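
Several of these commits ("fix: pack llama-server", "fix: cuda dll search path for child process", "fix: terminate nix process") point at the core change: running llama.cpp's llama-server binary as a managed child process instead of linking llama.cpp in-process. A minimal sketch of that pattern, with placeholder binary path, model path, and port (none taken from this PR), might look like:

#!/usr/bin/env bash
# Sketch only: spawn llama-server as a child process, poll its /health
# endpoint until it is ready, then talk to its HTTP API. All paths and the
# port below are placeholders.
LLAMA_SERVER=./build/bin/llama-server
MODEL=./models/test.gguf
PORT=3928

"$LLAMA_SERVER" -m "$MODEL" --port "$PORT" &
SERVER_PID=$!

# llama-server answers on /health once the model is loaded.
for _ in $(seq 1 30); do
  curl -sf "http://127.0.0.1:$PORT/health" >/dev/null && break
  sleep 1
done

# ... issue OpenAI-compatible requests here ...

# Clean up the child process on exit (the "terminate nix process" fix
# concerns this shutdown path on Unix-like systems).
kill "$SERVER_PID"
wait "$SERVER_PID" 2>/dev/null
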
Empty file modified .github/scripts/e2e-test-server-linux-and-mac.sh
100644 → 100755
Empty file.
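
The mode change above (100644 to 100755) marks the Linux/macOS e2e script as executable; locally it is equivalent to:

chmod +x .github/scripts/e2e-test-server-linux-and-mac.sh
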
18 changes: 2 additions & 16 deletions .github/scripts/e2e-test-server-windows.bat
@@ -20,7 +20,6 @@ echo BINARY_NAME=%BINARY_NAME%

del %TEMP%\response1.log 2>nul
del %TEMP%\response2.log 2>nul
del %TEMP%\response3.log 2>nul
del %TEMP%\response4.log 2>nul
del %TEMP%\response5.log 2>nul
del %TEMP%\response6.log 2>nul
@@ -65,18 +64,18 @@ call set "MODEL_LLM_PATH_STRING=%%MODEL_LLM_PATH:\=\\%%"
call set "MODEL_EMBEDDING_PATH_STRING=%%MODEL_EMBEDDING_PATH:\=\\%%"
set "curl_data1={\"llama_model_path\":\"%MODEL_LLM_PATH_STRING%\"}"
set "curl_data2={\"messages\":[{\"content\":\"Hello there\",\"role\":\"assistant\"},{\"content\":\"Write a long and sad story for me\",\"role\":\"user\"}],\"stream\":false,\"model\":\"testllm\",\"max_tokens\":50,\"stop\":[\"hello\"],\"frequency_penalty\":0,\"presence_penalty\":0,\"temperature\":0.1}"
set "curl_data3={\"llama_model_path\":\"%MODEL_LLM_PATH_STRING%\"}"
set "curl_data4={\"llama_model_path\":\"%MODEL_EMBEDDING_PATH_STRING%\", \"embedding\": true, \"model_type\": \"embedding\"}"
set "curl_data5={}"
set "curl_data6={\"input\": \"Hello\", \"model\": \"test-embedding\", \"encoding_format\": \"float\"}"
@REM set "curl_data7={\"model\": \"test-embedding\"}"

rem Print the values of curl_data for debugging
echo curl_data1=%curl_data1%
echo curl_data2=%curl_data2%
echo curl_data3=%curl_data3%
echo curl_data4=%curl_data4%
echo curl_data5=%curl_data5%
echo curl_data6=%curl_data6%
@REM echo curl_data7=%curl_data7%

rem Run the curl commands and capture the status code
curl.exe --connect-timeout 60 -o "%TEMP%\response1.log" -s -w "%%{http_code}" --location "http://127.0.0.1:%PORT%/loadmodel" --header "Content-Type: application/json" --data "%curl_data1%" > %TEMP%\response1.log 2>&1
@@ -85,8 +84,6 @@ curl.exe --connect-timeout 60 -o "%TEMP%\response2.log" -s -w "%%{http_code}" --
--header "Content-Type: application/json" ^
--data "%curl_data2%" > %TEMP%\response2.log 2>&1

curl.exe --connect-timeout 60 -o "%TEMP%\response3.log" -s -w "%%{http_code}" --location "http://127.0.0.1:%PORT%/unloadmodel" --header "Content-Type: application/json" --data "%curl_data3%" > %TEMP%\response3.log 2>&1

curl.exe --connect-timeout 60 -o "%TEMP%\response4.log" --request POST -s -w "%%{http_code}" --location "http://127.0.0.1:%PORT%/loadmodel" --header "Content-Type: application/json" --data "%curl_data4%" > %TEMP%\response4.log 2>&1

curl.exe --connect-timeout 60 -o "%TEMP%\response5.log" --request GET -s -w "%%{http_code}" --location "http://127.0.0.1:%PORT%/models" --header "Content-Type: application/json" --data "%curl_data5%" > %TEMP%\response5.log 2>&1
@@ -100,7 +97,6 @@ set "error_occurred=0"
rem Read the status codes from the log files
for /f %%a in (%TEMP%\response1.log) do set "response1=%%a"
for /f %%a in (%TEMP%\response2.log) do set "response2=%%a"
for /f %%a in (%TEMP%\response3.log) do set "response3=%%a"
for /f %%a in (%TEMP%\response4.log) do set "response4=%%a"
for /f %%a in (%TEMP%\response5.log) do set "response5=%%a"
for /f %%a in (%TEMP%\response6.log) do set "response6=%%a"
@@ -117,12 +113,6 @@ if "%response2%" neq "200" (
set "error_occurred=1"
)

if "%response3%" neq "200" (
echo The third curl command failed with status code: %response3%
type %TEMP%\response3.log
set "error_occurred=1"
)

if "%response4%" neq "200" (
echo The fourth curl command failed with status code: %response4%
type %TEMP%\response4.log
@@ -158,10 +148,6 @@ echo ----------------------
echo Log run test:
type %TEMP%\response2.log

echo ----------------------
echo Log unload model:
type %TEMP%\response3.log

echo ----------------------
echo Log load embedding model:
type %TEMP%\response4.log
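
The deletions above drop the explicit /unloadmodel round-trip (curl_data3/response3). For orientation, a rough bash equivalent of the sequence the trimmed script still exercises is sketched below; the chat and embeddings endpoint paths are assumptions, since the rendered diff truncates those URLs, and the model paths are placeholders:

PORT=3928
BASE="http://127.0.0.1:$PORT"

# 1. Load the LLM (mirrors curl_data1).
curl -s --location "$BASE/loadmodel" \
  --header "Content-Type: application/json" \
  --data '{"llama_model_path":"/path/to/model.gguf"}'

# 2. Run a non-streaming chat completion (mirrors curl_data2;
#    endpoint path assumed).
curl -s --location "$BASE/v1/chat/completions" \
  --header "Content-Type: application/json" \
  --data '{"messages":[{"role":"user","content":"Write a long and sad story for me"}],"model":"testllm","stream":false,"max_tokens":50}'

# 3. Load the embedding model (mirrors curl_data4).
curl -s --location "$BASE/loadmodel" \
  --header "Content-Type: application/json" \
  --data '{"llama_model_path":"/path/to/embedding.gguf","embedding":true,"model_type":"embedding"}'

# 4. List loaded models (mirrors curl_data5).
curl -s --request GET "$BASE/models"

# 5. Request embeddings (mirrors curl_data6; endpoint path assumed).
curl -s --location "$BASE/v1/embeddings" \
  --header "Content-Type: application/json" \
  --data '{"input":"Hello","model":"test-embedding","encoding_format":"float"}'
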
60 changes: 30 additions & 30 deletions .github/workflows/build.yml

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion .github/workflows/convert-model-all-quant.yml
@@ -39,7 +39,7 @@ jobs:
- name: Apply patch file
run: |
cd llama.cpp
git apply ../patches/0001-Add-API-query-buffer-size.patch
git apply ../patches/0002-Build-llama-cpp-examples.patch

- name: Set up Python
uses: actions/setup-python@v5 # v5.1.1
2 changes: 1 addition & 1 deletion .github/workflows/create-pr-sync-remote.yml
@@ -47,4 +47,4 @@ jobs:
- name: Apply patch file
run: |
cd llama.cpp
git apply ../patches/0001-Add-API-query-buffer-size.patch
git apply ../patches/0002-Build-llama-cpp-examples.patch
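
This and the other workflow files swap 0001-Add-API-query-buffer-size.patch for 0002-Build-llama-cpp-examples.patch, which, going by its name, builds llama.cpp's example targets (llama-server among them). If a workflow wanted to fail fast on patch conflicts, a dry run could precede the apply; git apply --check validates the patch without touching the tree:

# Optional hardening sketch, not part of this PR's workflow steps.
cd llama.cpp
git apply --check ../patches/0002-Build-llama-cpp-examples.patch
git apply ../patches/0002-Build-llama-cpp-examples.patch
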
97 changes: 63 additions & 34 deletions .github/workflows/nightly-build.yml

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions .github/workflows/nightly.yml
@@ -85,7 +85,7 @@ jobs:
- name: Apply patch file
run: |
cd llama.cpp
git apply ../patches/0001-Add-API-query-buffer-size.patch
git apply ../patches/0002-Build-llama-cpp-examples.patch

- name: Wait for CI to pass
env:
@@ -133,7 +133,7 @@ jobs:
- name: Apply patch file
run: |
cd llama.cpp
git apply ../patches/0001-Add-API-query-buffer-size.patch
git apply ../patches/0002-Build-llama-cpp-examples.patch

- name: Configure Git
run: |
2 changes: 1 addition & 1 deletion .github/workflows/template-e2e-weekend-test.yml
@@ -91,7 +91,7 @@ jobs:
- name: Apply patch file
run: |
cd llama.cpp
git apply ../patches/0001-Add-API-query-buffer-size.patch
git apply ../patches/0002-Build-llama-cpp-examples.patch

- name: Set up Python
uses: actions/setup-python@v5 # v5.1.1