Text Generate REST API schema #18
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,236 @@ | ||
openapi: 3.1.0 | ||
info: | ||
title: Open Inference API for text generation | ||
description: Open Inference API for text generation | ||
version: 1.0.0 | ||
components: | ||
schemas: | ||
Details: | ||
type: object | ||
additionalProperties: {} | ||
properties: | ||
finish_reason: | ||
type: string | ||
logprobs: | ||
Review comment: both |
||
$ref: '#/components/schemas/Logprobs' | ||
GenerateErrorResponse: | ||
type: object | ||
required: | ||
- error | ||
properties: | ||
error: | ||
type: string | ||
GenerateParameters: | ||
type: object | ||
additionalProperties: {} | ||
properties: | ||
temperature: | ||
type: number | ||
format: float | ||
default: 1 | ||
minimum: 0 | ||
description: What sampling temperature to use, higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. | ||
top_p: | ||
type: number | ||
format: float | ||
maximum: 1 | ||
minimum: 0 | ||
description: An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. | ||
max_tokens: | ||
type: integer | ||
format: int32 | ||
default: 20 | ||
minimum: 1 | ||
description: The maximum number of tokens to generate in the completion. | ||
stop: | ||
type: array | ||
items: | ||
type: string | ||
description: Sequences where the API will stop generating further tokens. | ||
logprob: | ||
type: boolean | ||
Review comment: Can you add a description for this flag? Also, I think this should be the |
||
GenerateRequest: | ||
type: object | ||
required: | ||
- text_input | ||
properties: | ||
text_input: | ||
type: string | ||
parameters: | ||
allOf: | ||
- $ref: '#/components/schemas/GenerateParameters' | ||
GenerateResponse: | ||
type: object | ||
required: | ||
- text_output | ||
- model_name | ||
properties: | ||
text_output: | ||
type: string | ||
model_name: | ||
type: string | ||
model_version: | ||
type: string | ||
details: | ||
$ref: '#/components/schemas/Details' | ||
GenerateStreamResponse: | ||
type: object | ||
required: | ||
- text_output | ||
- model_name | ||
properties: | ||
text_output: | ||
Review comment: This is the concatenated text output; we might still want to see the token generated for each iteration.
Review comment: In the Nvidia implementation, each response returns the cumulative set of tokens. Should we add an additional property to show the tokens generated in the current response? |
||
type: string | ||
model_name: | ||
type: string | ||
model_version: | ||
type: string | ||
details: | ||
$ref: '#/components/schemas/StreamDetails' | ||
Logprobs: | ||
Review comment: Suggest changing the naming to
Review comment: Add a description for this. |
||
type: array | ||
items: | ||
$ref: '#/components/schemas/Token' | ||
StreamDetails: | ||
type: object | ||
additionalProperties: {} | ||
properties: | ||
finish_reason: | ||
type: string | ||
token: | ||
$ref: '#/components/schemas/Token' | ||
Token: | ||
type: object | ||
required: | ||
- id | ||
- text | ||
- logprob | ||
- special | ||
properties: | ||
id: | ||
type: integer | ||
format: int32 | ||
minimum: 0 | ||
logprob: | ||
type: number | ||
format: float | ||
special: | ||
type: boolean | ||
text: | ||
type: string | ||
Review comment: Let's make sure we have descriptions for these fields. |
||
paths: | ||
/v2/models/${MODEL_NAME}/versions/${MODEL_VERSION}/generate: | ||
post: | ||
parameters: | ||
- name: MODEL_NAME | ||
required: true | ||
in: path | ||
schema: | ||
type: string | ||
- name: MODEL_VERSION | ||
required: true | ||
in: path | ||
schema: | ||
type: string | ||
requestBody: | ||
content: | ||
application/json: | ||
schema: | ||
$ref: '#/components/schemas/GenerateRequest' | ||
responses: | ||
'200': | ||
description: generated text | ||
content: | ||
application/json: | ||
schema: | ||
$ref: '#/components/schemas/GenerateResponse' | ||
'422': | ||
description: Input validation error | ||
content: | ||
application/json: | ||
schema: | ||
$ref: '#/components/schemas/GenerateErrorResponse' | ||
example: | ||
error: Input validation error | ||
'424': | ||
description: Generation Error | ||
content: | ||
application/json: | ||
schema: | ||
$ref: '#/components/schemas/GenerateErrorResponse' | ||
example: | ||
error: Request failed during generation | ||
'429': | ||
description: Model is overloaded | ||
content: | ||
application/json: | ||
schema: | ||
$ref: '#/components/schemas/GenerateErrorResponse' | ||
example: | ||
error: Model is overloaded | ||
'500': | ||
description: Incomplete generation | ||
content: | ||
application/json: | ||
schema: | ||
$ref: '#/components/schemas/GenerateErrorResponse' | ||
example: | ||
error: Incomplete generation | ||
|
||
/v2/models/${MODEL_NAME}/versions/${MODEL_VERSION}/generate_stream: | ||
post: | ||
parameters: | ||
- name: MODEL_NAME | ||
required: true | ||
in: path | ||
schema: | ||
type: string | ||
- name: MODEL_VERSION | ||
required: true | ||
in: path | ||
schema: | ||
type: string | ||
requestBody: | ||
content: | ||
application/json: | ||
schema: | ||
$ref: '#/components/schemas/GenerateRequest' | ||
responses: | ||
'200': | ||
description: generated text stream | ||
content: | ||
text/event-stream: | ||
schema: | ||
$ref: '#/components/schemas/GenerateStreamResponse' | ||
'422': | ||
description: Input validation error | ||
content: | ||
text/event-stream: | ||
schema: | ||
$ref: '#/components/schemas/GenerateErrorResponse' | ||
example: | ||
error: Input validation error | ||
'424': | ||
description: Generation Error | ||
content: | ||
text/event-stream: | ||
schema: | ||
$ref: '#/components/schemas/GenerateErrorResponse' | ||
example: | ||
error: Request failed during generation | ||
'429': | ||
description: Model is overloaded | ||
content: | ||
text/event-stream: | ||
schema: | ||
$ref: '#/components/schemas/GenerateErrorResponse' | ||
example: | ||
error: Model is overloaded | ||
'500': | ||
description: Incomplete generation | ||
content: | ||
text/event-stream: | ||
schema: | ||
$ref: '#/components/schemas/GenerateErrorResponse' | ||
example: | ||
error: Incomplete generation |
Review comment: finish_reason should be an enum.
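To make the request side of the schema easier to review, here is a minimal client-side sketch of how a GenerateRequest body and the two endpoint paths might be assembled. It is not part of the PR: the helper names `generate_path` and `build_generate_request` are hypothetical, and the checks only mirror the bounds declared in GenerateParameters in this diff (temperature >= 0, top_p in [0, 1], max_tokens >= 1).

```python
from typing import Any, Dict, List, Optional


def generate_path(model_name: str, model_version: str, stream: bool = False) -> str:
    """Return the generate (or generate_stream) path for one model version."""
    suffix = "generate_stream" if stream else "generate"
    return f"/v2/models/{model_name}/versions/{model_version}/{suffix}"


def build_generate_request(
    text_input: str,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    max_tokens: Optional[int] = None,
    stop: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build a GenerateRequest body (hypothetical helper, not from the PR).

    Only the bounds declared in GenerateParameters are enforced here.
    """
    if temperature is not None and temperature < 0:
        raise ValueError("temperature must be >= 0")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")
    if max_tokens is not None and max_tokens < 1:
        raise ValueError("max_tokens must be >= 1")

    candidates = {
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
        "stop": stop,
    }
    # Leave out unset parameters so the server-side defaults apply.
    parameters = {key: value for key, value in candidates.items() if value is not None}

    body: Dict[str, Any] = {"text_input": text_input}  # text_input is the only required field
    if parameters:
        body["parameters"] = parameters
    return body


if __name__ == "__main__":
    import json

    print(generate_path("my-model", "1"))
    print(json.dumps(build_generate_request("Hello", temperature=0.2, max_tokens=5)))
```

Unset parameters are omitted rather than sent as null, since the schema declares defaults for temperature and max_tokens and does not mark any parameter as nullable.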