- Install Docker: https://docs.docker.com/engine/install/ubuntu/
- Clone the repo:
  ```bash
  git clone https://github.com/DenisDiachkov/llm_instruct_ms
  cd llm_instruct_ms
  ```
- Build:
  ```bash
  docker build -f dockerfile.prod -t llm_instruct_ms .
  ```
- Run:
  ```bash
  docker compose up
  ```
Access the GUI at http://127.0.0.1:8000
GET `127.0.0.1:8000/generate`

Request query parameters:
- `prompt`: str
- `system_prompt`: str
- `reply_prefix`: str
- `image`: str, base64-encoded
- `max_new_tokens`: int
- `temperature`: float
- `top_p`: float
- `repetition_penalty`: float
- `frequency_penalty`: float
- `presence_penalty`: float
The server responds with a JSON object containing:
- `status`: str, either `"error"` or `"finished"`
- `content`: str, the LLM's response
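As a sketch, the GET endpoint above can be called with only the Python standard library. The base URL matches the run instructions; the prompt and sampling values are illustrative, not defaults from the service:

```python
import json
import urllib.parse
import urllib.request


def build_generate_url(base_url: str, **params) -> str:
    """Build the GET /generate URL with the query parameters listed above."""
    return f"{base_url}/generate?" + urllib.parse.urlencode(params)


def generate(base_url: str = "http://127.0.0.1:8000", **params) -> dict:
    """Call GET /generate and return the parsed JSON response
    (a dict with at least `status` and `content`)."""
    url = build_generate_url(base_url, **params)
    with urllib.request.urlopen(url, timeout=60) as resp:
        return json.load(resp)


if __name__ == "__main__":
    result = generate(
        prompt="What is the capital of France?",
        max_new_tokens=64,
        temperature=0.7,
    )
    if result["status"] == "finished":
        print(result["content"])
```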
Streaming request parameters (the same fields as above):
- `prompt`: str
- `system_prompt`: str
- `reply_prefix`: str
- `image`: str, base64-encoded
- `max_new_tokens`: int
- `temperature`: float
- `top_p`: float
- `repetition_penalty`: float
- `frequency_penalty`: float
- `presence_penalty`: float
The server will stream JSON objects containing:
- `status`: str, either `"error"` or `"finished"`
- `content`: str, the LLM's streaming response

Depending on the status, a message may also contain other entries; see `asyncio_queue_manager` for details.
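A minimal sketch of consuming the stream, assuming each streamed message is a JSON object with the `status` and `content` fields documented above, and that messages before the terminal `"finished"` carry partial content chunks. The transport (e.g. WebSocket) and any intermediate status values are not specified in this excerpt, so the handler only special-cases the two documented statuses:

```python
import json
from typing import Iterable


def collect_stream(messages: Iterable[str]) -> str:
    """Accumulate `content` from streamed JSON messages until a
    terminal status arrives.

    Raises RuntimeError on status "error"; stops on status "finished".
    Any other status is treated as an intermediate chunk (assumption).
    """
    parts = []
    for raw in messages:
        msg = json.loads(raw)
        if msg["status"] == "error":
            raise RuntimeError(f"generation failed: {msg}")
        parts.append(msg.get("content", ""))
        if msg["status"] == "finished":
            break
    return "".join(parts)
```

For example, feeding it two messages, a chunk followed by the terminal one, yields the concatenated text; an `"error"` message raises instead of returning partial output.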