Support llama.cpp directly, bypassing ollama #233

Closed
savchenko opened this issue Nov 13, 2024 · 15 comments · Fixed by #234
Assignee: fmaclen
Labels: bug (Something isn't working), priority, released


savchenko commented Nov 13, 2024

Given the close relationship between ollama and llama.cpp, would it be possible to support llama-server?

It exposes an OpenAI-compatible HTTP endpoint on localhost.
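For reference, a typical way to start it (the model path and port here are just examples):

llama-server -m ./Qwen2.5-Coder-32B-Instruct-Q4_K_S.gguf --port 8080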

savchenko changed the title from "Support llama.cpp" to "Support llama.cpp directly, bypassing ollama" Nov 13, 2024
fmaclen (Owner) commented Nov 13, 2024

We recently added support for OpenAI servers; you can find the configuration in the Settings view.

Can you configure it with your llama-server and let me know if it works?

savchenko (Author) commented Nov 13, 2024

Tested with v0.20.1, connectivity reports as working:

[screenshot: connection reported as working]

However, model parsing fails and a model can't be selected in the "Sessions" tab.

[screenshot: model selection failing in the "Sessions" tab]

In the llama.cpp console the request succeeds:

request: GET /v1/models 127.0.0.1 200

Manual curl returns:
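Assuming llama-server's default port, the request would be something like:

curl http://localhost:8080/v1/models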

{
  "object": "list",
  "data": [
    {
      "id": "/home/user/Qwen2.5-Coder-32B-Instruct-Q4_K_S.gguf",
      "object": "model",
      "created": 1731505790,
      "owned_by": "llamacpp",
      "meta": {
        "vocab_type": 2,
        "n_vocab": 152064,
        "n_ctx_train": 32768,
        "n_embd": 5120,
        "n_params": 32763876352,
        "size": 18778431488
      }
    }
  ]
}

EDIT

I have noticed that the OpenAI endpoint can't be saved without an API key; the "refresh" button in the UI is inactive unless the key field is non-empty.

[screenshot: inactive "refresh" button next to the empty API key field]

Providing one does not make any difference though.

fmaclen (Owner) commented Nov 13, 2024

Thanks for the detailed report, I'll need to take a closer look to see where it might be going wrong.

> I have noticed that the OpenAI endpoint can't be saved without an API key

Yeah, since this feature was designed specifically for OpenAI, it wouldn't work without an API key, so we made it "mandatory". We should probably document this better.

When we connect to Ollama via the OpenAI-compatible API, we just enter a random API key, which gets ignored anyway.
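llama.cpp behaves the same way: unless llama-server was started with --api-key, the Authorization header is simply ignored, so any placeholder value works. A sketch, assuming the default port:

curl -H "Authorization: Bearer not-a-real-key" http://localhost:8080/v1/models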

fmaclen added the "triage (Need to investigate further)" label Nov 13, 2024
savchenko (Author) commented:

> Thanks for the detailed report, I'll need to take a closer look to see where it might be going wrong.

Not a problem.

Also, I've checked the console, but there is no output at any level apart from the benign preload warnings.

"Network" tab shows 200s to the ../models/ with the same JSON payload as I have provided above.

fmaclen added the "bug (Something isn't working)" label and removed the "triage (Need to investigate further)" label Nov 14, 2024
fmaclen self-assigned this Nov 14, 2024
fmaclen (Owner) commented Nov 14, 2024

Found the cause of the problem. Our current implementation filters out any models that don't include "gpt" in their name.
Therefore Qwen2.5-Coder-32B-Instruct-Q4_K_S.gguf gets filtered out.

Removing the filter makes it work:

[screenshot: llama.cpp model available after removing the filter]

The filter exists because when we fetch models from OpenAI, the API also returns a list of non-LLM models that are incompatible with Hollama.
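For context, the fix is roughly the following (a sketch, not the exact Hollama source; isOfficialOpenAI is a hypothetical flag distinguishing api.openai.com from OpenAI-compatible servers):

interface Model {
	id: string;
}

// Only apply the "gpt" name filter to the official OpenAI API; for
// OpenAI-compatible servers (llama.cpp, etc.) pass every model through.
// The old behavior was the unconditional filter, which dropped ids like
// "/home/user/Qwen2.5-Coder-32B-Instruct-Q4_K_S.gguf".
function chatModels(models: Model[], isOfficialOpenAI: boolean): Model[] {
	return isOfficialOpenAI ? models.filter((m) => m.id.includes('gpt')) : models;
}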

fmaclen (Owner) commented Nov 15, 2024

@savchenko here's a work-in-progress demo if you want to check it out:
https://llama-cpp-llama-server-opena.hollama.pages.dev/settings

You'll need to add an "OpenAI compatible" connection type to set up your llama.cpp server.
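Something like these connection settings should work (field names and values assumed; llama-server defaults):

Base URL: http://localhost:8080/v1
API key:  any non-empty placeholder (llama.cpp ignores it unless started with --api-key)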

savchenko (Author) commented:

@fmaclen, the interface is slightly broken. Latest Firefox ESR, v128.3.1.

[screenshot: broken layout in Firefox]

fmaclen (Owner) commented Nov 15, 2024

@savchenko "slightly broken" is quite the understatement 😅
Just pushed a fix; if you refresh the page it should look correct in Firefox.

savchenko (Author) commented:

Fresh container build from 00f5862

Firefox

[screenshot: UI still broken in Firefox]

Clicking on the SL links yields no UI changes, while the dev console shows:

Uncaught (in promise) TypeError: e.servers is undefined
    Immutable 10
        r
        ce
        F
        _t
        at
        jt
        le
        rt
        rn
        ln
    <anonymous> http://localhost:4173/sessions:45
    promise callback* http://localhost:4173/sessions:44
3.BdijOe1Y.js:1:3551

Chromium

The interface works in Chromium; however, attempting to query llama.cpp shows the following error:

[screenshot: "Invalid strategy" error]

I do not observe any new messages in llama.cpp's stdout after clicking "Run" in Hollama.

fmaclen (Owner) commented Nov 16, 2024

Thanks for the update.

I was able to replicate the issue you are seeing with Firefox and I'm pretty sure it's caused by some hacky code I wrote just to quickly try things out.

That being said, it works fine for me in Chromium. If you were using the most recent release of Hollama in the same browser (with the same URL/hostname), it's possible it has conflicting settings stored in localStorage. This is something I still need to test/QA before releasing this new version.
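If that's the case, clearing the site's stored data from the dev console should rule it out (note: this wipes all locally stored Hollama settings for that origin):

// Run in the dev console on the Hollama page.
localStorage.clear();
location.reload();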

Couple of questions, if you don't mind:

  1. What command are you running to build the container?
  2. Does the UI load correctly in Firefox from the live demo? https://llama-cpp-llama-server-opena.hollama.pages.dev/settings
  3. In Chromium, do you still get the same Invalid strategy error in an Incognito window?

savchenko (Author) commented:

  1. git pull && git checkout 00f5862
    docker build -t maybellama .
    docker run -p 4173:4173 maybellama
  2. Yes
  3. Yes:
     [screenshot: same "Invalid strategy" error in an Incognito window]

fmaclen (Owner) commented Nov 16, 2024

@savchenko thanks for the clarification.

Try building fee51b7, which should have fixed the Invalid strategy error and the layout issues in Firefox.
There are still a handful of smaller bugs, but you should be able to interact with llama-server 🤞

savchenko (Author) commented:

Success!

[screenshot: successful response from llama.cpp]

Shall this be closed?

fmaclen (Owner) commented Nov 17, 2024

Glad to hear it's working!

> Shall this be closed?

No, the issue will be closed automatically once the feature is released.
There is still a fair amount of cleanup and testing I need to do before we can push this out.

fmaclen (Owner) commented Nov 25, 2024

🎉 This issue has been resolved in version 0.22.0 🎉

The release is available on GitHub.

Your semantic-release bot 📦🚀
