This is an Ilab API Server: a temporary set of APIs for developing apps against InstructLab. It provides endpoints for model management, data generation, training, job tracking, and job logging.
- Ensure that the required directories (`base-dir` and `taxonomy-path`) exist and are accessible, and that Go is installed in your `$PATH`.
To install the necessary dependencies, run:

```shell
go mod download
```

To start the server, pass `--osx` for macOS (Metal) or `--cuda` for a CUDA-enabled Linux machine:

```shell
go run main.go --base-dir /path/to/base-dir --taxonomy-path /path/to/taxonomy --osx
go run main.go --base-dir /path/to/base-dir --taxonomy-path /path/to/taxonomy --cuda
```
- If you're operating on a Red Hat Enterprise Linux AI (RHEL AI) machine and the `ilab` binary is already available in your `$PATH`, you don't need to specify `--base-dir`. Additionally, enable CUDA support with `--cuda`:

```shell
go run main.go --taxonomy-path ~/.local/share/instructlab/taxonomy/ --rhelai --cuda
```
The `--rhelai` flag indicates that the `ilab` binary is available in the system's `$PATH` and does not require a virtual environment. When using `--rhelai`, the `--base-dir` flag is not required, since the binary lives in a known location, at least for now.
Here's an example command for running the server on a macOS machine with Metal support:

```shell
go run main.go --base-dir /Users/user/code/instructlab --taxonomy-path ~/.local/share/instructlab/taxonomy/ --osx
```
Endpoint: GET /models
Fetches the list of available models.
- Response:

```json
[
  {
    "name": "model-name",
    "last_modified": "timestamp",
    "size": "size-string"
  }
]
```
Endpoint: GET /data
Fetches the list of datasets.
- Response:

```json
[
  {
    "dataset": "dataset-name",
    "created_at": "timestamp",
    "file_size": "size-string"
  }
]
```
Endpoint: POST /data/generate
Starts a data generation job.
- Request: None
- Response:

```json
{ "job_id": "generated-job-id" }
```
Endpoint: GET /jobs
Fetches the list of all jobs.
- Response:

```json
[
  {
    "job_id": "job-id",
    "status": "running/finished/failed",
    "cmd": "command",
    "branch": "branch-name",
    "start_time": "timestamp",
    "end_time": "timestamp"
  }
]
```
Endpoint: GET /jobs/{job_id}/status
Fetches the status of a specific job.
- Response:

```json
{
  "job_id": "job-id",
  "status": "running/finished/failed",
  "branch": "branch-name",
  "command": "command"
}
```
Endpoint: GET /jobs/{job_id}/logs
Fetches the logs of a specific job.
- Response: Text logs of the job.
Endpoint: POST /model/train
Starts a training job.
- Request:

```json
{ "modelName": "name-of-the-model", "branchName": "name-of-the-branch" }
```

Note: The `modelName` can be provided with or without the `models/` prefix. Examples:

- Without prefix: `"granite-7b-lab-Q4_K_M.gguf"`
- With prefix: `"models/granite-7b-starter"`

The server will handle the prefix to construct the correct model path.
- Response:

```json
{ "job_id": "training-job-id" }
```
Endpoint: POST /pipeline/generate-train
Combines data generation and training into a single pipeline job.
- Request:

```json
{ "modelName": "name-of-the-model", "branchName": "name-of-the-branch" }
```

Note: As with the training endpoint, `modelName` can be provided with or without the `models/` prefix.

- Response:

```json
{ "pipeline_job_id": "pipeline-job-id" }
```
Endpoint: POST /model/serve-latest
Serves the latest model checkpoint on port `8001`.
- Response:

```json
{ "status": "model process started", "job_id": "serve-job-id" }
```
Endpoint: POST /model/serve-base
Serves the base model on port `8000`.
- Response:

```json
{ "status": "model process started", "job_id": "serve-job-id" }
```
The server is designed to handle `modelName` inputs both with and without the `models/` prefix, to prevent path duplication. Here's how it works:

- Without prefix:
  - Input: `"granite-7b-lab-Q4_K_M.gguf"`
  - Constructed path: `~/.cache/instructlab/models/granite-7b-lab-Q4_K_M.gguf`
- With prefix:
  - Input: `"models/granite-7b-starter"`
  - Constructed path: `~/.cache/instructlab/models/granite-7b-starter`
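The prefix handling described above can be sketched as a small Go helper. The `modelPath` function and the hard-coded cache root are illustrative, not the server's actual code:

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// modelPath builds the on-disk path for a model name, tolerating an
// optional "models/" prefix so the segment is never duplicated.
func modelPath(cacheDir, modelName string) string {
	name := strings.TrimPrefix(modelName, "models/")
	return filepath.Join(cacheDir, "models", name)
}

func main() {
	// Illustrative cache root; the server resolves this under $HOME.
	cache := "/home/user/.cache/instructlab"
	fmt.Println(modelPath(cache, "granite-7b-lab-Q4_K_M.gguf"))
	// /home/user/.cache/instructlab/models/granite-7b-lab-Q4_K_M.gguf
	fmt.Println(modelPath(cache, "models/granite-7b-starter"))
	// /home/user/.cache/instructlab/models/granite-7b-starter
}
```

Both spellings of the name resolve to the same `models/` directory, which is the point of the normalization.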