
Extract model api into a separate SPM package #40

Open
pgorzelany opened this issue Dec 18, 2023 · 4 comments

Comments

@pgorzelany

First of all I just wanted to say this is an awesome project! I downloaded and played with the app and I was blown away by the quality of the chat. It feels like GPT 3.5 but is completely free and private.

Until today I was completely oblivious to the fact that there were open-source models that could match ChatGPT, and I was even more surprised that those models share a common format (GGUF) and are interchangeable.

From the README I can see that your goal is to "make open, local, private models accessible to more people". I was thinking of extracting the part of the app where you can download and run any GGUF model into a separate SPM package. That would most certainly boost availability, since more developers could integrate it. Would you be willing to collaborate on such a project? I can help with it if I get a little guidance.
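For what it's worth, a first cut at the extraction could be as small as a new manifest with the download/inference code moved into a library target. A minimal sketch, assuming a hypothetical package name and layout (nothing below is existing FreeChat code):

```swift
// swift-tools-version:5.9
// Hypothetical Package.swift for the extracted package; all names are placeholders.
import PackageDescription

let package = Package(
    name: "LocalLLM",
    platforms: [.macOS(.v13), .iOS(.v16)],
    products: [
        // The library other apps would depend on.
        .library(name: "LocalLLM", targets: ["LocalLLM"])
    ],
    targets: [
        // Model download + inference code factored out of the app target.
        .target(name: "LocalLLM"),
        .testTarget(name: "LocalLLMTests", dependencies: ["LocalLLM"])
    ]
)
```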

@psugihara
Owner

psugihara commented Dec 18, 2023

Hey, thanks for checking it out! I'm glad you like the app and I'm very open to collaborating. Just did some light net stalking and it looks like you've done a lot of swift dev 🤗. Is there a specific app you want to build with it? What you describe is similar to my original intent with the NPC directory but it got hairy for a few reasons.

Here's a bit of context...

  • The architecture I use embeds the llama.cpp server executable running on localhost + a small watchdog process I wrote to kill it if the main app ever dies. The server approach makes it so I don't have to deal with details of string tokenization and prompt caching. Instead it's all done (and maintained!) here. Unfortunately this makes it incompatible with iOS. This arch has worked well for me as a solo dev, but I'm not attached to it if you wanted to pull those responsibilities into an equally performant swift package.

  • I was originally intending for NPC to be a nice package for writing AI agents in swift. I didn't end up writing any agent stuff for FreeChat, so the Agent class is kind of a placeholder that could be factored out. The main LLM dataflow is ConversationManager <-> Agent <-> LlamaServer.

  • There's a bare bones example of using llama.cpp with SwiftUI in the llama.cpp repo. I would guess that the perf with this is not quite as good as the chat server because of the prompt caching that server does, but I haven't verified yet. This example is relatively new.
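To make the first bullet concrete, the watchdog idea can be sketched as a tiny helper process that is handed the server's PID, polls whether the launching app is still alive, and takes the orphaned server down with it otherwise. This is only an illustrative sketch under those assumptions, not FreeChat's actual watchdog code:

```swift
// Hedged sketch of a watchdog process: given the llama.cpp server's PID and
// the main app's PID, kill the server if the app ever dies.
// All names here are illustrative, not FreeChat's code.
import Foundation

func watch(serverPID: pid_t, parentPID: pid_t) {
    while true {
        // kill(pid, 0) delivers no signal; it only reports whether the process exists.
        if kill(parentPID, 0) != 0 {
            // The app is gone; take the orphaned llama.cpp server down with it.
            kill(serverPID, SIGTERM)
            exit(0)
        }
        Thread.sleep(forTimeInterval: 1.0)
    }
}
```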

Also feel free to shoot me an email to discuss.

@psugihara
Owner

psugihara commented Dec 19, 2023

Hey @pgorzelany, I played with the llama.cpp swiftui example and I think it's actually very performant and promising.

I didn't realize they had in fact implemented all of the complicated caching and tokenizing stuff here: https://github.com/ggerganov/llama.cpp/blob/master/examples/llama.swiftui/llama.cpp.swift/LibLlama.swift

I'd like to try refactoring FreeChat to use that code rather than the localhost server setup. Then we could compare perf in-context. If it's as fast, this would be a much simpler architecture and we could more easily factor it all out of FreeChat as a cross-platform swift module.
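One way to compare the two in-context without committing to either would be a small seam that both the localhost-server backend and an in-process llama.cpp backend adopt. Everything below is a hypothetical sketch (the protocol, the `EchoBackend` stand-in, and all names are made up for illustration, not actual FreeChat or llama.cpp identifiers):

```swift
import Foundation

// Hypothetical abstraction so the existing localhost-server backend and an
// in-process llama.cpp backend (LibLlama-style) could be swapped and compared.
protocol CompletionBackend {
    // Stream tokens via the callback; return the full completion.
    func complete(prompt: String, onToken: (String) -> Void) throws -> String
}

// Stand-in backend used here only to show the call shape.
struct EchoBackend: CompletionBackend {
    func complete(prompt: String, onToken: (String) -> Void) throws -> String {
        var output = ""
        for word in prompt.split(separator: " ") {
            let token = String(word) + " "
            onToken(token)   // the UI would append tokens as they arrive
            output += token
        }
        return output.trimmingCharacters(in: .whitespaces)
    }
}
```

With a seam like this, the perf comparison becomes swapping one initializer while the rest of the chat UI stays untouched.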

Would you want to take a swing at that effort? If so, comment or assign yourself on #42

@pgorzelany
Author

Hey, thanks for all the info, it's really helpful! I am so new to all this AI stuff that my head spins when reading about it. I may take a shot at #42, but I first need to understand what is going on and what llama.cpp is. No promises though, since my time is super limited by family obligations, but this stuff is really interesting :)

@psugihara
Owner

Sounds good, no pressure at all. If I end up taking a look at it myself, I'll ping you.
