@fblissjr Perhaps you can try it: https://github.com/madroidmaq/mlx-omni-server
-
Been thinking a bit about an elegant way to allow modifying system messages / system instructions in the generate functions for instruct/chat models, as well as pre-fill. Essentially this creates a message chain of:

system role -> user role -> assistant role -> ... (n turns)

This is an often underused but powerful way to work with single-step instruct models (we do this often with Anthropic's pre-fill feature in the API). Here's an example of both:
- Adding only a `--system-message` arg (screenshot omitted)
- Adding a messages file for pre-fill (screenshot omitted)
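A minimal sketch of the two message chains those examples produce. The role/content layout follows the common OpenAI-style chat format; the `--system-message` flag and the messages-file convention are this proposal's, not an existing mlx-lm API:

```python
# Case 1: --system-message only, so the chain is system + user,
# and generation starts fresh from the assistant turn.
system_only = [
    {"role": "system", "content": "You are a terse assistant."},
    {"role": "user", "content": "Summarize MLX in one sentence."},
]

# Case 2: a messages file with multi-turn pre-fill. The chain ends on an
# assistant message, and the model continues that text (Anthropic-style
# pre-fill) instead of starting a new reply.
prefilled = [
    {"role": "system", "content": "You are a terse assistant."},
    {"role": "user", "content": "List three MLX features."},
    {"role": "assistant", "content": "1."},  # model continues from "1."
]

# The role of the final message tells the generator whether to pre-fill.
is_prefill = prefilled[-1]["role"] == "assistant"
```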
For the system role, I've been doing this for quite some time in my local repo for mlx-lm, and I think this is probably the simplest approach: in generate.py, extend the `messages` building to support it:
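A sketch of what that could look like. The `--system-message` flag name comes from this proposal, and `build_messages` is a hypothetical helper; in generate.py the resulting list would then go through the tokenizer's chat template as the existing prompt does:

```python
import argparse

def build_messages(prompt, system_message=None):
    """Build an OpenAI-style messages list, prepending an optional system role."""
    messages = []
    if system_message:
        messages.append({"role": "system", "content": system_message})
    messages.append({"role": "user", "content": prompt})
    return messages

# Arg wiring as it might appear alongside generate.py's existing flags:
parser = argparse.ArgumentParser()
parser.add_argument(
    "--system-message",
    default=None,
    help="Optional system instruction prepended to the chat.",
)
args = parser.parse_args(["--system-message", "You are terse."])

messages = build_messages("Hello", args.system_message)
# messages would then be rendered via the tokenizer's chat template,
# e.g. tokenizer.apply_chat_template(messages, add_generation_prompt=True)
```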
However, to support the multi-turn pre-fill, the best I can come up with is a messages JSON file, which could also work well with the cache capabilities. Here's my initial proposed approach, but I'm open to simpler ones if they exist:
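One possible shape for that messages file, with loader-side validation. The file layout, the role set, and the convention that ending on an assistant message signals pre-fill are all assumptions of this sketch:

```python
import json
import tempfile

# Hypothetical messages.json: a plain list of role/content pairs. Ending on
# an "assistant" message means pre-fill: the model continues that text.
MESSAGES_JSON = """[
  {"role": "system", "content": "You are a terse assistant."},
  {"role": "user", "content": "List three MLX features."},
  {"role": "assistant", "content": "1."}
]"""

VALID_ROLES = {"system", "user", "assistant"}

def load_messages(path):
    """Load and sanity-check a messages file before handing it to generate."""
    with open(path) as f:
        messages = json.load(f)
    for m in messages:
        if m.get("role") not in VALID_ROLES or "content" not in m:
            raise ValueError(f"bad message: {m!r}")
    return messages

# Round-trip the example through a temp file, as the CLI would read it.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    f.write(MESSAGES_JSON)
    path = f.name

messages = load_messages(path)
prefill = messages[-1]["role"] == "assistant"  # True: continue from "1."
```

Keeping the file as a plain message list also means the same structure can be hashed or replayed against the prompt cache, since the rendered prompt is fully determined by it.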
Welcome any thoughts here, or let me know if I should just open a PR and move the discussion there.