Consistent chat completion id in OpenAI compatible chat completion endpoint #5876


Closed
xyc opened this issue Mar 4, 2024 · 1 comment · Fixed by #5988
Labels
bug: Something isn't working · good first issue: Good for newcomers

Comments

@xyc
Contributor

xyc commented Mar 4, 2024

First, thank you for this great project!

I was wondering whether, for the OpenAI-compatible chat completion endpoint, the streaming responses should return the same completion id (chatcmpl-) for every chunk.

For example, the chat completion ids differ (chatcmpl-2EHCQqsRzdOlFskNehCMu2oOMTXhSjey vs. chatcmpl-Cm7q0Ru5uEGVlW4r6cZaGNQrlS7oF724) in the following response:

{"choices":[{"delta":{"content":"Under"},"finish_reason":null,"index":0}],"created":1709591108,"id":"chatcmpl-2EHCQqsRzdOlFskNehCMu2oOMTXhSjey","model":"gpt-3.5-turbo","object":"chat.completion.chunk"}

{"choices":[{"delta":{"content":"stood"},"finish_reason":null,"index":0}],"created":1709591108,"id":"chatcmpl-Cm7q0Ru5uEGVlW4r6cZaGNQrlS7oF724","model":"gpt-3.5-turbo","object":"chat.completion.chunk"}

...
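
For reference, OpenAI's own API keeps the id constant across the whole stream, so consecutive chunks look like this (illustrative values, not captured output):

{"choices":[{"delta":{"content":"Under"},"finish_reason":null,"index":0}],"created":1709591108,"id":"chatcmpl-sameIdForEveryChunk","model":"gpt-3.5-turbo","object":"chat.completion.chunk"}

{"choices":[{"delta":{"content":"stood"},"finish_reason":null,"index":0}],"created":1709591108,"id":"chatcmpl-sameIdForEveryChunk","model":"gpt-3.5-turbo","object":"chat.completion.chunk"}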

Example NodeJS code that generates the above chunks:

import OpenAI from "openai";

process.env["OPENAI_API_KEY"] = "no-key";

const openai = new OpenAI({
  baseURL: "http://127.0.0.1:8080/v1",
  apiKey: "no-key",
});

async function main() {
  const stream = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    messages: [{ role: "user", content: "Say this is a test" }],
    stream: true,
  });
  for await (const chunk of stream) {
    process.stdout.write(JSON.stringify(chunk));
  }
}

main();

This is fine for Node.js server-side generation, but if I stream the HTTP response and consume it with OpenAI's NodeJS SDK, I get this error: missing finish_reason for choice 0. It seems that when a chunk with a different id is supplied here, #endRequest is called prematurely and the corresponding chunk will not have a finish_reason. (A possible workaround is sketched after the client code below.)

Example client-side code:

import fetch from 'node-fetch';
import { ChatCompletionStream } from 'openai/lib/ChatCompletionStream';

fetch('http://localhost:3000', {
  method: 'POST',
  body: 'Tell me why dogs are better than cats',
  headers: { 'Content-Type': 'text/plain' },
}).then(async (res) => {
  // @ts-ignore ReadableStream on different environments can be strange
  const runner = ChatCompletionStream.fromReadableStream(res.body);

  runner.on('content', (delta, snapshot) => {
    process.stdout.write(delta);
    // or, in a browser, you might display like this:
    // document.body.innerText += delta; // or:
    // document.body.innerText = snapshot;
  });

  console.dir(await runner.finalChatCompletion(), { depth: null });
});
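
One possible workaround until the server is fixed (my own sketch, not part of the SDK or llama.cpp; pinChunkIds is a hypothetical helper) is to rewrite the SSE stream so every chunk reuses the id of the first chunk before the SDK consumes it:

import { Transform } from 'node:stream';

// Pin every chunk's id to the id of the first chunk. Assumes the upstream
// server emits standard SSE lines of the form "data: {...}".
export function pinChunkIds() {
  let firstId = null;
  let buffer = '';
  return new Transform({
    transform(chunk, _encoding, callback) {
      buffer += chunk.toString('utf8');
      const lines = buffer.split('\n');
      buffer = lines.pop(); // keep any partial line for the next read
      const out = lines.map((line) => {
        if (!line.startsWith('data: ') || line === 'data: [DONE]') return line;
        const payload = JSON.parse(line.slice('data: '.length));
        firstId ??= payload.id; // remember the first chunk's id
        payload.id = firstId;   // overwrite later ids so they all match
        return 'data: ' + JSON.stringify(payload);
      });
      callback(null, out.join('\n') + (out.length ? '\n' : ''));
    },
  });
}

Piping the upstream body through this transform (e.g. res.body.pipe(pinChunkIds())) before handing it to ChatCompletionStream.fromReadableStream should keep #endRequest from firing early, since the id never changes mid-stream.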
xyc changed the title from "consistent chat completion id in openai compatible server endpoint" to "consistent chat completion id in openai compatible chat completion endpoint" Mar 4, 2024
xyc changed the title from "consistent chat completion id in openai compatible chat completion endpoint" to "Consistent chat completion id in OpenAI compatible chat completion endpoint" Mar 4, 2024
@ggerganov
Member

Yes, I guess so. I'm not familiar with it, but if the original OAI API returns the same id, then we should emulate this.
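
For illustration, the general shape of such a fix is to generate the completion id once per request and stamp it on every chunk. A minimal sketch in Node terms (llama.cpp's server is C++, and this is not the actual code from #5988; handleChatCompletion and generateDeltas are hypothetical):

import crypto from 'node:crypto';

// Illustrative Express-style handler: one id for the whole stream,
// matching the behavior of the official OpenAI API.
function handleChatCompletion(req, res, generateDeltas) {
  const id = 'chatcmpl-' + crypto.randomBytes(15).toString('base64url');
  const created = Math.floor(Date.now() / 1000);
  res.setHeader('Content-Type', 'text/event-stream');
  for (const { delta, finish_reason } of generateDeltas(req)) {
    const chunk = {
      id, // the same id on every chunk
      object: 'chat.completion.chunk',
      created,
      model: req.body.model,
      choices: [{ index: 0, delta, finish_reason }],
    };
    res.write('data: ' + JSON.stringify(chunk) + '\n\n');
  }
  res.write('data: [DONE]\n\n');
  res.end();
}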
