
Conversation

@jcabrero (Member) commented on Jan 22, 2025

This PR changes how vLLM is used as a backend. Previously we ran vLLM in offline inference mode; we have since found a more correct way to deploy it by running the vLLM OpenAI-compatible server instead. This unlocks the full set of OpenAI-compatible features, which should be built on in follow-up PRs.
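For context, here is a minimal sketch of the deployment pattern this PR moves to. The model name and port are placeholders, not values taken from this PR:

```python
# Hypothetical example: first start vLLM's OpenAI-compatible server,
# e.g. (model name and port are illustrative):
#
#   vllm serve mistralai/Mistral-7B-Instruct-v0.2 --port 8000
#
# Then talk to it with the standard OpenAI client instead of calling
# vLLM's offline LLM class directly.
from openai import OpenAI

# vLLM's server does not require a real API key by default.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.2",  # must match the served model
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```

Because the server speaks the OpenAI API, any client or framework that targets that API works against it unchanged, which is what makes the follow-up feature work straightforward.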

@jcabrero merged commit b10db04 into main on Jan 24, 2025
1 check passed
@jcabrero deleted the feat/new_vllm_backend branch on February 18, 2025