[LLM] Provide an interface used for loading models on the FastChat side
#10282
Description
Similar to lm-sys/FastChat#2888, we can provide an example worker implementation that extends BaseModelWorker and overrides the following three methods:
After that, we can provide a bigdl_worker and documentation on how to use it.
The controller and worker communicate through a set of interfaces; see https://github.com/lm-sys/FastChat/blob/main/fastchat/serve/base_model_worker.py#L196. Therefore, if bigdl_worker implements these interfaces successfully, we can integrate BigDL-LLM into FastChat.
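The subclassing pattern described above might look roughly like the sketch below. It is illustrative only: WorkerBase is a stand-in written for this example, not FastChat's BaseModelWorker, and the method names (generate_stream_gate, generate_gate, get_embeddings), the payload fields, and the NUL-byte chunk framing are assumptions that should be checked against the linked base_model_worker.py. EchoWorker is a hypothetical toy that echoes the prompt instead of running a model.

```python
import json
from abc import ABC, abstractmethod
from typing import Dict, Iterator


class WorkerBase(ABC):
    """Stand-in for the real base class; method names are assumptions."""

    @abstractmethod
    def generate_stream_gate(self, params: Dict) -> Iterator[bytes]:
        ...

    @abstractmethod
    def generate_gate(self, params: Dict) -> Dict:
        ...

    @abstractmethod
    def get_embeddings(self, params: Dict) -> Dict:
        ...


class EchoWorker(WorkerBase):
    """Hypothetical toy worker: echoes the prompt word by word."""

    def generate_stream_gate(self, params: Dict) -> Iterator[bytes]:
        text = ""
        for word in params["prompt"].split():
            text = (text + " " + word).strip()
            # Assumed framing: one JSON payload per chunk, NUL-terminated.
            yield json.dumps({"text": text, "error_code": 0}).encode() + b"\0"

    def generate_gate(self, params: Dict) -> Dict:
        # Non-streaming call: drain the stream and keep the final chunk.
        result = {"text": "", "error_code": 0}
        for chunk in self.generate_stream_gate(params):
            result = json.loads(chunk[:-1].decode())
        return result

    def get_embeddings(self, params: Dict) -> Dict:
        # Placeholder "embedding": the length of each input string.
        return {"embedding": [[float(len(s))] for s in params["input"]]}
```

A real bigdl_worker would replace the echo logic with calls into the loaded model, while keeping the same method surface so the controller does not need to change.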
1. Why the change?
Similar to https://github.com/lm-sys/FastChat/blob/main/fastchat/serve/mlx_worker.py#L33, we can provide an interface on the BigDL side that handles model loading, so that the code on the FastChat side stays stable.
Users start the controller as usual, but start the worker with a command such as python3 -m fastchat.serve.bigdl_worker --model-names "bigdl-models" --model-path lmsys/vicuna-7b-v1.5.
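To make the split of responsibilities concrete, here is a minimal sketch of how a worker could delegate loading to a BigDL-side function. Everything in it is hypothetical: load_model, LoadedModel, and Worker are names invented for this example and return placeholder strings rather than a real model. Only the two flags, --model-names and --model-path, come from the command above, and the comma-splitting of --model-names is an assumption.

```python
import argparse
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class LoadedModel:
    model: Any
    tokenizer: Any


def load_model(model_path: str, device: str = "cpu") -> LoadedModel:
    """Hypothetical BigDL-side loader; returns placeholders, not a real model."""
    return LoadedModel(
        model=f"model<{model_path}@{device}>",
        tokenizer=f"tokenizer<{model_path}>",
    )


class Worker:
    """Hypothetical worker that only depends on the loader's signature."""

    def __init__(
        self,
        model_names: List[str],
        model_path: str,
        loader: Callable[[str], LoadedModel] = load_model,
    ) -> None:
        self.model_names = model_names
        loaded = loader(model_path)
        self.model = loaded.model
        self.tokenizer = loaded.tokenizer


def parse_args(argv: List[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    # Assumption: several names may be given as a comma-separated list.
    parser.add_argument("--model-names", type=lambda s: s.split(","))
    parser.add_argument("--model-path", type=str, required=True)
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args(
        ["--model-names", "bigdl-models", "--model-path", "lmsys/vicuna-7b-v1.5"]
    )
    worker = Worker(args.model_names, args.model_path)
    print(worker.model_names, worker.model)
```

Because the worker sees only the loader's signature, changes to how BigDL loads or optimizes a model stay behind load_model and do not touch the worker code.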
2. User API changes
3. Summary of the change
Provide an interface used for loading models on the FastChat side.
4. How to test?
.
5. New dependencies
None