Consistent chat completion id in OpenAI compatible chat completion endpoint #5876


Closed
xyc opened this issue Mar 4, 2024 · 1 comment · Fixed by #5988
Labels
bug: Something isn't working · good first issue: Good for newcomers

Comments

@xyc
Contributor

xyc commented Mar 4, 2024

First, thank you for this great project!

I was wondering whether, for the OpenAI-compatible chat completion endpoint, the streaming responses should return the same completion id (chatcmpl-) for every chunk.

For example, the chat completion ids differ (chatcmpl-2EHCQqsRzdOlFskNehCMu2oOMTXhSjey vs. chatcmpl-Cm7q0Ru5uEGVlW4r6cZaGNQrlS7oF724) in the following response:

{"choices":[{"delta":{"content":"Under"},"finish_reason":null,"index":0}],"created":1709591108,"id":"chatcmpl-2EHCQqsRzdOlFskNehCMu2oOMTXhSjey","model":"gpt-3.5-turbo","object":"chat.completion.chunk"}

{"choices":[{"delta":{"content":"stood"},"finish_reason":null,"index":0}],"created":1709591108,"id":"chatcmpl-Cm7q0Ru5uEGVlW4r6cZaGNQrlS7oF724","model":"gpt-3.5-turbo","object":"chat.completion.chunk"}

...
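
For reference, OpenAI's own API keeps the id constant across the whole stream, so consecutive chunks look like this (illustrative values, not captured output):

{"choices":[{"delta":{"content":"Under"},"finish_reason":null,"index":0}],"created":1709591108,"id":"chatcmpl-sameIdForEveryChunk","model":"gpt-3.5-turbo","object":"chat.completion.chunk"}

{"choices":[{"delta":{"content":"stood"},"finish_reason":null,"index":0}],"created":1709591108,"id":"chatcmpl-sameIdForEveryChunk","model":"gpt-3.5-turbo","object":"chat.completion.chunk"}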

Example NodeJS code that generates the above chunks:

import OpenAI from "openai";

process.env["OPENAI_API_KEY"] = "no-key";

const openai = new OpenAI({
  baseURL: "http://127.0.0.1:8080/v1",
  apiKey: "no-key",
});

async function main() {
  const stream = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    messages: [{ role: "user", content: "Say this is a test" }],
    stream: true,
  });
  for await (const chunk of stream) {
    process.stdout.write(JSON.stringify(chunk));
  }
}

main();

This is fine for Node.js server-side generation, but if I stream the HTTP response and consume it with OpenAI's NodeJS SDK, I get this error: missing finish_reason for choice 0. It seems that when a chunk with a different id is supplied here, #endRequest is called prematurely and the corresponding chunk will not have a finish_reason. (A possible workaround is sketched after the client code below.)

Example client-side code:

import fetch from 'node-fetch';
import { ChatCompletionStream } from 'openai/lib/ChatCompletionStream';

fetch('http://localhost:3000', {
  method: 'POST',
  body: 'Tell me why dogs are better than cats',
  headers: { 'Content-Type': 'text/plain' },
}).then(async (res) => {
  // @ts-ignore ReadableStream on different environments can be strange
  const runner = ChatCompletionStream.fromReadableStream(res.body);

  runner.on('content', (delta, snapshot) => {
    process.stdout.write(delta);
    // or, in a browser, you might display like this:
    // document.body.innerText += delta; // or:
    // document.body.innerText = snapshot;
  });

  console.dir(await runner.finalChatCompletion(), { depth: null });
});
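
One possible workaround until the server is fixed (my own sketch, not part of the SDK or llama.cpp; pinChunkIds is a hypothetical helper) is to rewrite the SSE stream so every chunk reuses the id of the first chunk before the SDK consumes it:

import { Transform } from 'node:stream';

// Pin every chunk's id to the id of the first chunk. Assumes the upstream
// server emits standard SSE lines of the form "data: {...}".
export function pinChunkIds() {
  let firstId = null;
  let buffer = '';
  return new Transform({
    transform(chunk, _encoding, callback) {
      buffer += chunk.toString('utf8');
      const lines = buffer.split('\n');
      buffer = lines.pop(); // keep any partial line for the next read
      const out = lines.map((line) => {
        if (!line.startsWith('data: ') || line === 'data: [DONE]') return line;
        const payload = JSON.parse(line.slice('data: '.length));
        firstId ??= payload.id; // remember the first chunk's id
        payload.id = firstId;   // overwrite later ids so they all match
        return 'data: ' + JSON.stringify(payload);
      });
      callback(null, out.join('\n') + (out.length ? '\n' : ''));
    },
  });
}

Piping the upstream body through this transform (e.g. res.body.pipe(pinChunkIds())) before handing it to ChatCompletionStream.fromReadableStream should keep #endRequest from firing early, since the id never changes mid-stream.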
xyc changed the title from "consistent chat completion id in openai compatible server endpoint" to "consistent chat completion id in openai compatible chat completion endpoint" Mar 4, 2024
xyc changed the title from "consistent chat completion id in openai compatible chat completion endpoint" to "Consistent chat completion id in OpenAI compatible chat completion endpoint" Mar 4, 2024
@ggerganov
Member

Yes, I guess so. I'm not familiar with it, but if the original OAI API returns the same id, then we should emulate this.
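
For illustration, the general shape of such a fix is to generate the completion id once per request and stamp it on every chunk. A minimal sketch in Node terms (llama.cpp's server is C++, and this is not the actual code from #5988; handleChatCompletion and generateDeltas are hypothetical):

import crypto from 'node:crypto';

// Illustrative Express-style handler: one id for the whole stream,
// matching the behavior of the official OpenAI API.
function handleChatCompletion(req, res, generateDeltas) {
  const id = 'chatcmpl-' + crypto.randomBytes(15).toString('base64url');
  const created = Math.floor(Date.now() / 1000);
  res.setHeader('Content-Type', 'text/event-stream');
  for (const { delta, finish_reason } of generateDeltas(req)) {
    const chunk = {
      id, // the same id on every chunk
      object: 'chat.completion.chunk',
      created,
      model: req.body.model,
      choices: [{ index: 0, delta, finish_reason }],
    };
    res.write('data: ' + JSON.stringify(chunk) + '\n\n');
  }
  res.write('data: [DONE]\n\n');
  res.end();
}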
