Text Generate REST API schema #18
Conversation
Propose generate REST API endpoints

Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
- model_version
- done
properties:
  text_output:
This is the concatenated text output; we might still want to see the token generated for each iteration.
In the Nvidia implementation, each response returns the cumulative set of tokens.

First JSON response:

{
  "text_output": "Here is"
}

Subsequent JSON response:

{
  "text_output": "Here is the output for the prompt"
}

Should we add an additional property to expose the tokens generated in the current response?
Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
@yuzisun Wanted to follow up: is the current state of the changes alright?
Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
I have updated with all the recently discussed changes.
Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
We should probably add the option to return log probabilities in the result; this seems to be fairly common among other APIs. This would comprise a boolean flag.
Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
I have updated the PR to support the above items.
type: string
logprobs:
  $ref: '#/components/schemas/Logprobs'
Logprobs:
Suggest changing the name to `Token`, as it is not just a logprob field; see https://github.com/huggingface/text-generation-inference/blob/main/docs/openapi.json#L844.
  type: string
model_version:
  type: string
logprobs:
Suggest changing the name here; in TGI it is called `details`, which includes the tokens. Not sure if we should follow the same: https://github.com/huggingface/text-generation-inference/blob/main/docs/openapi.json#L645
In OpenAI it is `choices`: https://platform.openai.com/docs/api-reference/chat/object
In OpenAI, `logprobs` is a property under `choices`.

I was not sure here either; it is open to any suggestions.
Current:

Output -> {
  text_output,
  model_name,
  model_version,
  logprobs -> List[Token]
}

Token follows TGI: https://huggingface.github.io/text-generation-inference/#/Text%20Generation%20Inference/generate_stream

Token -> {
  id,
  logprob,
  special,
  text
}
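The shape above could be written as an OpenAPI fragment along these lines. This is a sketch only: the `Token` fields follow TGI, but names such as `GenerateResponse` are assumptions, not the authoritative schema from this PR.

```yaml
# Sketch only: hypothetical OpenAPI fragment for the response shape
# discussed above. GenerateResponse is an assumed name.
components:
  schemas:
    GenerateResponse:
      type: object
      properties:
        text_output:
          type: string
        model_name:
          type: string
        model_version:
          type: string
        logprobs:
          type: array
          items:
            $ref: '#/components/schemas/Token'
    Token:
      type: object
      properties:
        id:
          type: integer
          format: int32
          minimum: 0
        logprob:
          type: number
          format: float
        special:
          type: boolean
        text:
          type: string
```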
  type: string
finish_reason:
  type: string
logprobs:
For the streaming case it is a single token.
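Assuming the single-token behavior described in this comment, the streaming response could be sketched as follows; `GenerateStreamResponse` is an assumed name, not taken from the PR.

```yaml
# Sketch only: streaming response where logprobs is a single Token
# rather than an array (per the comment above).
GenerateStreamResponse:
  type: object
  properties:
    text_output:
      type: string
    finish_reason:
      type: string
    logprobs:
      $ref: '#/components/schemas/Token'
```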
parameters:
  allOf:
    - $ref: '#/components/schemas/GenerateParameters'
logprob:
Should this be part of `GenerateParameters`?
Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
  type: string
  description: Sequences where the API will stop generating further tokens.
logprob:
  type: boolean
Can you add a description for this flag? Also, I think this should be the `details` flag, as `logprob` is one of the fields on it.
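One way the suggested rename could look, with a description on the flag. This is a sketch, not the merged schema; the wording of the description is an assumption.

```yaml
# Sketch only: the boolean request flag renamed to `details`, with a
# description, as suggested in the comment above.
details:
  type: boolean
  default: false
  description: >
    Whether to return generation details (such as the finish reason and
    per-token log probabilities) alongside the generated text.
```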
  type: string
details:
  $ref: '#/components/schemas/StreamDetails'
Logprobs:
Add a description for this.
id:
  type: integer
  format: int32
  minimum: 0
logprob:
  type: number
  format: float
special:
  type: boolean
text:
  type: string
Let's make sure we have descriptions for these fields.
type: object
additionalProperties: {}
properties:
  finish_reason:
`finish_reason` should be an enum.
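A sketch of what the enum could look like. The values here are borrowed from TGI's `FinishReason` and are an assumption, not the list settled on in this PR.

```yaml
# Sketch only: enum values modeled on TGI's FinishReason; assumed.
finish_reason:
  type: string
  enum:
    - length
    - eos_token
    - stop_sequence
```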
properties:
  finish_reason:
    type: string
  logprobs:
Both `finish_reason` and `logprobs` should be required if `details` is requested.
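The requiredness suggestion could be expressed as follows; this is a hypothetical fragment, and the schema name `Details` is assumed.

```yaml
# Sketch only: mark finish_reason and logprobs as required on the
# details object, per the comment above.
Details:
  type: object
  required:
    - finish_reason
    - logprobs
  properties:
    finish_reason:
      type: string
    logprobs:
      type: array
      items:
        $ref: '#/components/schemas/Token'
```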
Signed-off-by: Gavrish Prabhu <gavrish.prabhu@nutanix.com>
Thanks @gavrishp !! Great job on getting this going with the initial version. /lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: gavrishp, yuzisun. The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing
Propose generate REST API endpoints
Reference - https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/protocol/extension_generate.html#generate-extension
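Per the linked Triton generate extension, the proposed endpoints follow roughly this shape. This is a sketch: only the path pattern comes from the Triton docs, and the operation bodies and summaries are elided or assumed.

```yaml
# Sketch only: endpoint paths per the Triton generate extension.
paths:
  /v2/models/{model_name}/generate:
    post:
      summary: Generate a completion for a prompt (non-streaming)
  /v2/models/{model_name}/generate_stream:
    post:
      summary: Generate a completion as a stream of server-sent events
```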