
Conversation

@FlorianJoncour (Contributor)

This is a reset of #2210.

The final goal is to implement function calls using the OpenAI API. Since that was probably too much to review all at once, we will do it in two parts.

This pull request is only a refactoring/relocation of code that separates the Uvicorn server, the chat endpoint, and the completions endpoint. Chat and completions now live in separate classes. The goal is to make the whole codebase clearer and easier to modify in the future, since the completions API should now be considered legacy.

The chat part has been split into several methods, while the completions code is largely unchanged apart from being wrapped in a class.

Tested chat and completions, with and without stream mode.
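
As a rough sketch of the resulting layout (class and file names follow the review hunks quoted below; the bodies are illustrative placeholders, not the PR's actual code):

```python
# serving_chat.py: the chat endpoint, now broken into helper methods.
class OpenAIServingChat:

    async def create_chat_completion(self, request, raw_request):
        # validate the request, apply the chat template, then either
        # stream chunks or build a full response (one helper per step)
        ...


# serving_completion.py: the legacy completions endpoint, essentially
# the old code wrapped in a class.
class OpenAIServingCompletion:

    async def create_completion(self, request, raw_request):
        ...
```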

@simon-mo self-assigned this Jan 8, 2024
@NikolaBorisov (Contributor) left a comment:

I think this looks ok.

@simon-mo (Collaborator) left a comment:

Two minor points:

Review thread on the completions route in api_server.py:

```diff
-        return StreamingResponse(fake_stream_generator(),
+    generator = await openai_serving_completion.create_completion(
+        request, raw_request)
+    logger.info("TYPE COMPLETION : %s" % str(type(generator)))
```
Suggested change (drop the debug log):

```diff
-    logger.info("TYPE COMPLETION : %s" % str(type(generator)))
```

The second thread, on the serving class constructor:

```python
    engine_model_config.tokenizer,
    tokenizer_mode=engine_model_config.tokenizer_mode,
    trust_remote_code=engine_model_config.trust_remote_code)
self._load_chat_template(self.chat_template)
```
The chat template is the responsibility of ChatCompletion only.
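
That is, the tokenizer setup can stay shared while the chat-template loading belongs in the chat class alone. A minimal sketch of that separation, reusing the names from the hunk above (the `get_tokenizer` import and the constructor shape are assumptions, not the PR's actual code):

```python
from vllm.transformers_utils.tokenizer import get_tokenizer


class OpenAIServing:
    """Shared base: owns the tokenizer for chat and completions alike."""

    def __init__(self, engine_model_config):
        self.tokenizer = get_tokenizer(
            engine_model_config.tokenizer,
            tokenizer_mode=engine_model_config.tokenizer_mode,
            trust_remote_code=engine_model_config.trust_remote_code)


class OpenAIServingChat(OpenAIServing):
    """Only the chat endpoint needs a chat template."""

    def __init__(self, engine_model_config, chat_template=None):
        super().__init__(engine_model_config)
        self._load_chat_template(chat_template)

    def _load_chat_template(self, chat_template):
        # Attach a Jinja chat template (literal string) to the tokenizer;
        # sketch only, the real code also handles file paths and logging.
        if chat_template is not None:
            self.tokenizer.chat_template = chat_template
```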

@simon-mo (Collaborator)

will fix and merge this once #2355 is in.

@FlorianJoncour (Contributor, Author)

Fine.

I made the changes anyway ^^

@viktor-ferenczi mentioned this pull request Jan 13, 2024
@simon-mo merged commit 14cc317 into vllm-project:main Jan 17, 2024
@simon-mo (Collaborator)

@FlorianJoncour, merged! Thank you for the contribution, looking forward to the tool calling PR!

@jessiewiswjc

@FlorianJoncour Is there a new PR for function_call yet?

@FlorianJoncour (Contributor, Author)

I'm working on it; it shouldn't take too long.

hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Jan 18, 2024
joennlae added a commit to joennlae/vllm that referenced this pull request Jan 21, 2024
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024