Gateway error when processing non-200 model response #722
Comments
Here is the file where the error is occurring:
This method does not have error handling:
export const VertexAnthropicChatCompleteStreamChunkTransform: (
  response: string,
  fallbackId: string,
  streamState: Record<string, boolean>
) => string | undefined = (responseChunk, fallbackId, streamState) => {
  let chunk = responseChunk.trim();
  if (
    chunk.startsWith('event: ping') ||
    chunk.startsWith('event: content_block_stop') ||
    chunk.startsWith('event: vertex_event')
  ) {
    return;
  }
  if (chunk.startsWith('event: message_stop')) {
    return 'data: [DONE]\n\n';
  }
  chunk = chunk.replace(/^event: content_block_delta[\r\n]*/, '');
  chunk = chunk.replace(/^event: content_block_start[\r\n]*/, '');
  chunk = chunk.replace(/^event: message_delta[\r\n]*/, '');
  chunk = chunk.replace(/^event: message_start[\r\n]*/, '');
  chunk = chunk.replace(/^data: /, '');
  chunk = chunk.trim();
  const parsedChunk: AnthropicChatCompleteStreamResponse = JSON.parse(chunk);
  if (
    parsedChunk.type === 'content_block_start' &&
    parsedChunk.content_block?.type === 'text'
  ) {
    streamState.containsChainOfThoughtMessage = true;
    return;
  }
  if (parsedChunk.type === 'message_start' && parsedChunk.message?.usage) {
    return (
      `data: ${JSON.stringify({
        id: fallbackId,
        object: 'chat.completion.chunk',
        created: Math.floor(Date.now() / 1000),
        model: parsedChunk.message?.usage,
        provider: GOOGLE_VERTEX_AI,
        choices: [
          {
            delta: {
              content: '',
            },
            index: 0,
            logprobs: null,
            finish_reason: null,
          },
        ],
        usage: {
          prompt_tokens: parsedChunk.message?.usage?.input_tokens,
        },
      })}` + '\n\n'
    );
  }
  if (parsedChunk.type === 'message_delta' && parsedChunk.usage) {
    return (
      `data: ${JSON.stringify({
        id: fallbackId,
        object: 'chat.completion.chunk',
        created: Math.floor(Date.now() / 1000),
        model: '',
        provider: GOOGLE_VERTEX_AI,
        choices: [
          {
            index: 0,
            delta: {},
            finish_reason: parsedChunk.delta?.stop_reason,
          },
        ],
        usage: {
          completion_tokens: parsedChunk.usage?.output_tokens,
        },
      })}` + '\n\n'
    );
  }
  const toolCalls = [];
  const isToolBlockStart: boolean =
    parsedChunk.type === 'content_block_start' &&
    !!parsedChunk.content_block?.id;
  const isToolBlockDelta: boolean =
    parsedChunk.type === 'content_block_delta' &&
    !!parsedChunk.delta.partial_json;
  const toolIndex: number = streamState.containsChainOfThoughtMessage
    ? parsedChunk.index - 1
    : parsedChunk.index;
  if (isToolBlockStart && parsedChunk.content_block) {
    toolCalls.push({
      index: toolIndex,
      id: parsedChunk.content_block.id,
      type: 'function',
      function: {
        name: parsedChunk.content_block.name,
        arguments: '',
      },
    });
  } else if (isToolBlockDelta) {
    toolCalls.push({
      index: toolIndex,
      function: {
        arguments: parsedChunk.delta.partial_json,
      },
    });
  }
  return (
    `data: ${JSON.stringify({
      id: fallbackId,
      object: 'chat.completion.chunk',
      created: Math.floor(Date.now() / 1000),
      model: '',
      provider: GOOGLE_VERTEX_AI,
      choices: [
        {
          delta: {
            content: parsedChunk.delta?.text,
            tool_calls: toolCalls.length ? toolCalls : undefined,
          },
          index: 0,
          logprobs: null,
          finish_reason: parsedChunk.delta?.stop_reason ?? null,
        },
      ],
    })}` + '\n\n'
  );
};
Thanks for reporting this @mkrueger12 and thanks for being so detailed in the description!! Usually no provider returns an error in a chunk, so there is no error handling done here, but it's Google, they always gotta do something weird with their API standards, xd.
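For illustration, here is a minimal sketch of how the parse step could be guarded so that a non-JSON or unexpected chunk does not throw inside the stream transform. The helper names `safeParseChunk` and `toErrorEvent`, the `ParseResult` type, and the `upstream_error` label are all hypothetical and not part of the gateway codebase; the shape of the emitted error event is likewise an assumption, not the project's actual format.

```typescript
// Hypothetical helpers; names and error shape are assumptions for illustration.
type ParseResult =
  | { ok: true; value: Record<string, unknown> }
  | { ok: false; raw: string };

// Strip the SSE framing ("event: ..." line and "data: " prefix),
// then parse the JSON payload without throwing.
function safeParseChunk(responseChunk: string): ParseResult {
  const payload = responseChunk
    .trim()
    .replace(/^event: [a-z_]+[\r\n]*/, '')
    .replace(/^data: /, '')
    .trim();
  try {
    const value: unknown = JSON.parse(payload);
    // Only a plain JSON object counts as a usable chunk here.
    if (typeof value === 'object' && value !== null && !Array.isArray(value)) {
      return { ok: true, value: value as Record<string, unknown> };
    }
    return { ok: false, raw: payload };
  } catch {
    // Not valid JSON (for example an HTML or plain-text error body).
    return { ok: false, raw: payload };
  }
}

// Wrap an unparseable payload in an SSE event so the client sees an error
// instead of the stream crashing.
function toErrorEvent(raw: string, provider: string): string {
  return (
    `data: ${JSON.stringify({
      error: { message: raw, type: 'upstream_error' },
      provider,
    })}` + '\n\n'
  );
}
```

With something like this in place, the transform could call `safeParseChunk(responseChunk)` and return `toErrorEvent(result.raw, GOOGLE_VERTEX_AI)` when `result.ok` is false, rather than letting `JSON.parse` throw.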
What Happened?
Issue Description
The gateway fails to properly parse error messages when accessing Claude models on GCP Vertex AI via the streaming endpoint. The issue occurs when the model returns a non-200 response.
Environment
/v1/chat/completions
anthropic.claude-3-5-sonnet@20240620
Configuration
Error Output
Gateway logs:
Application logs:
Root Cause Analysis
Additional Notes
What Should Have Happened?
The gateway should return an appropriate error response to the application.
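One way this could look, sketched under assumptions: when the upstream status is not 200, skip the stream transform and convert the response body into a structured error. The function `buildErrorResponse` and the `GatewayError` type are hypothetical. The sketch assumes the upstream body, when it is JSON, carries an `error` object with a `message` and optionally a `status` or `type` and a `code`, possibly wrapped in an array; the actual body from the report is not preserved here, so treat that shape as a guess.

```typescript
// Hypothetical error shape handed back to the calling application.
interface GatewayError {
  status: number;
  body: {
    error: { message: string; type: string | null; code: string | number | null };
    provider: string;
  };
}

// Build an error response from a non-200 upstream status and its raw body text.
function buildErrorResponse(
  status: number,
  bodyText: string,
  provider: string
): GatewayError {
  let message = bodyText || `Upstream request failed with status ${status}`;
  let type: string | null = null;
  let code: string | number | null = null;
  try {
    const parsed: unknown = JSON.parse(bodyText);
    // Assumption: the error object may arrive wrapped in a one-element array.
    const first: unknown = Array.isArray(parsed) ? parsed[0] : parsed;
    if (typeof first === 'object' && first !== null) {
      const err = (first as Record<string, unknown>).error;
      if (typeof err === 'object' && err !== null) {
        const e = err as Record<string, unknown>;
        if (typeof e.message === 'string') message = e.message;
        if (typeof e.status === 'string') type = e.status;
        else if (typeof e.type === 'string') type = e.type;
        if (typeof e.code === 'string' || typeof e.code === 'number') code = e.code;
      }
    }
  } catch {
    // Body was not JSON; keep the raw text as the message.
  }
  return { status, body: { error: { message, type, code }, provider } };
}
```

The upstream status code is passed through unchanged so the application can distinguish, say, a rate limit from a bad request.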
Hitting the GCP API directly returns:
Relevant Code Snippet
No response
Your Twitter/LinkedIn
https://www.linkedin.com/in/maxkrueger1/