Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QA: [Nightly - 227] #1648

Open
TC117 opened this issue Nov 7, 2024 · 4 comments
Open

QA: [Nightly - 227] #1648

TC117 opened this issue Nov 7, 2024 · 4 comments
Assignees
Labels

Comments

@TC117
Copy link

TC117 commented Nov 7, 2024

QA details:

Version: v1.0.1-227

OS (select one)

  • Windows 11 (online & offline)
  • Ubuntu 24, 22 (online & offline)
  • Mac Silicon OS 14/15 (online & offline)
  • Mac Intel (online & offline)

1. Manual QA (CLI)

Installation

  • it should install with local installer (default; no internet required during installation, all dependencies bundled)
  • it should install with network installer
  • it should install 2 binaries (cortex and cortex-server) [mac: binaries in /usr/local/bin]
  • it should install with correct folder permissions
  • it should install with folders: /engines /logs (no /models folder until model pull)
  • It should install with Docker image https://cortex.so/docs/installation/docker/

Data/Folder structures

  • cortex.so models are stored in cortex.so/model_name/variants/, with .gguf and model.yml file
  • huggingface models are stored huggingface.co/author/model_name with .gguf and model.yml file
  • downloaded models are saved in cortex.db with the right fields: model, author_repo_id, branch_name, path_to_model_yaml (view via SQL)

Cortex Update

  • cortex -v should check output current version and check for updates
  • cortex update replaces the app, installer, uninstaller and binary file (without installing cortex.llamacpp)
  • cortex update should update from ~3-5 versions ago to latest (+3 to 5 bump)
  • cortex update should update from the previous version to latest (+1 bump)
  • cortex update -v 1.x.x-xxx should update from the previous version to specified version
  • cortex update should update from previous stable version to latest
  • it should gracefully update when server is actively running

Overall / App Shell

  • cortex returns helpful text in a timely* way (< 5s)
  • cortex or cortex -h displays help commands
  • CLI commands should start the API server, if not running [except
  • it should correctly log to cortex-cli.log and cortex.log
  • There should be no stdout from inactive shell session

Engines

  • llama.cpp should be installed by default
  • it should run gguf models on llamacpp
  • it should list engines
  • it should get engines
  • it should install engines (latest version if not specified)
  • it should install engines (with specified variant and version)
  • it should get default engine
  • it should set default engine (with specified variant/version)
  • it should load engine
  • it should unload engine
  • it should update engine (to latest version)
  • it should update engine (to specified version)
  • it should uninstall engines
  • it should gracefully continue engine installation if interrupted halfway (partial download)
  • it should gracefully handle when users try to CRUD incompatible engines (No variant found for xxx)
  • it should run trtllm models on trt-llm [WIP, not tested]
  • it shoud handle engine variants [WIP, not tested]
  • it should update engines versions [WIP, not tested]

Server

  • cortex start should start server and output localhost URL & port number
  • users can access API Swagger documentation page at localhost URL & port number
  • cortex start can be configured with parameters (port, logLevel [WIP]) https://cortex.so/docs/cli/start/
  • it should correctly log to cortex logs (logs/cortex.log, logs/cortex-cli.log)
  • cortex ps should return server status and running models (or no model loaded)
  • cortex stop should stop server

Model Pulling

  • Pulling a model should pull .gguf and model.yml file
  • Model download progress should appear as download bars for each file
  • Model download progress should be accurate (%, total time, download size, speed)

cortex.so

  • it should pull by built in model_ID
  • pull by model_ID should recommend default variant at the top (set in HF model.yml)
  • it should pull by built-in model_id:variant

huggingface.co

  • it should pull by HF repo/model ID
  • it should pull by full HF url (ending in .gguf)

Interrupted Download

  • it should allow user to interrupt / stop download
  • pulling again after interruption should accurately calculates remainder of model file size neeed to be downloaded (Found unfinished download! Additional XGB needs to be downloaded)
  • it should allow to continue downloading the remainder after interruption

Model Management

  • it should list downloaded models
  • it should get a local model
  • it should update model parameters in model.yaml
  • it should delete a model
  • it should import models with model_id and model_path

Model Running

  • cortex run <cortexso model> - if no local models detected, shows pull model menu
  • cortex run - if local model detected, runs the local model
  • cortex run - if multiple local models detected, shows list of local models (from multiple model sources eg cortexso, HF authors) for users to select (via regex search)
  • cortex run <invalid model id> should return gracefully Model not found!
  • run should autostart server
  • cortex run <model> starts interactive chat (by default)
  • cortex run <model> -d runs in detached mode
  • cortex models start <model>
  • terminate StdIn or exit() should exit interactive chat

Hardware Detection / Acceleration [WIP, no need to QA]

  • it should auto offload max ngl
  • it should correctly detect available GPUs
  • it should gracefully detect missing dependencies/drivers
    CPU Extension (e.g. AVX-2, noAVX, AVX-512)
    GPU Acceleration (e.g. CUDA11, CUDA12, Vulkan, sycl, etc)

Uninstallation / Reinstallation

  • it should uninstall 2 binaries (cortex and cortex-server)
  • it should uninstall with 2 options to delete or not delete data folder
  • it should gracefully uninstall when server is still running
  • uninstalling should not leave any dangling files
  • uninstalling should not leave any dangling processes
  • it should reinstall without having conflict issues with existing cortex data folders

--

2. API QA

Checklist for each endpoint

  • Upon cortex start, API page is displayed at localhost:port endpoint
  • Endpoints should support the parameters stated in API reference (towards OpenAI Compatibility)
  • https://cortex.so/api-reference is updated

Endpoints

Chat Completions

Engines

  • List engines: GET /v1/engines
  • Get engine: GET /v1/engines/{name}
  • Install engine: POST /v1/engines/install/{name}
  • Get default engine variant/version: GET v1/engines/{name}/default
  • Set default engine variant/version: POST v1/engines/{name}/default
  • Load engine: POST v1/engines/{name}/load
  • Unload engine: DELETE v1/engines/{name}/load
  • Update engine: POST v1/engines/{name}/update
  • uninstall engine: DELETE /v1/engines/install/{name}

Pulling Models

  • Pull model: POST /v1/models/pull starts download (websockets)
  • Pull model: websockets /events emitted
  • Stop model download: DELETE /v1/models/pull (websockets)
  • Stop model download: websockets /events stopped
  • Import model: POST v1/models/import

Running Models

  • List models: GET v1/models
  • Start model: POST /v1/models/start
  • Stop model: POST /v1/models/stop
  • Get model: GET /v1/models/{id}
  • Delete model: DELETE /v1/models/{id}
  • Update model: PATCH /v1/models/{model} updates model.yaml params

Server

  • CORs [WIP]
  • health: GET /healthz
  • terminate server: DELETE /processManager/destroy

Test list for reference:

@TC117 TC117 added the type: QA checklist QA checklist label Nov 7, 2024
@github-project-automation github-project-automation bot moved this to Investigating in Jan & Cortex Nov 7, 2024
@TC117
Copy link
Author

TC117 commented Nov 7, 2024

Should not allow delete model / engine while running
same action API
image

@TC117
Copy link
Author

TC117 commented Nov 7, 2024

  • API doc should generate a json file for API collection
    image

@TC117
Copy link
Author

TC117 commented Nov 7, 2024

does the /engines/:name/ load still work ???
image

@TC117
Copy link
Author

TC117 commented Nov 7, 2024

On CLI, we dont have engines load but on /chat/completions show message "Engine is not loaded yet"
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Investigating
Development

No branches or pull requests

1 participant