Skip to content

Bedrock instrumentation strips quotes from JSON strings in ConverseStream tool_use responses (v1.31.0/0.52b0+) #3537

@sightseeker

Description

@sightseeker

Describe your environment

OS: macOS and Debian 12 (confirmed on both)
Python version: Python 3.13
Package version: opentelemetry-python-contrib v1.31.0/0.52b0 or later

What happened?

When using Amazon Bedrock's ConverseStream API with tool_use functionality, JSON parsing errors occur during response processing. The issue is caused by the _decodetool_use function in bedrock_utils.py incorrectly handling double-quoted strings in chunked responses, removing necessary quotes and breaking JSON structure.

Steps to Reproduce

import json

import boto3
from opentelemetry.instrumentation.botocore import BotocoreInstrumentor

# Initialize instrumentation
BotocoreInstrumentor().instrument()

# Use Bedrock ConverseStream API with tool_use
client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse_stream(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",  # or any model supporting tools
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "text": "Use the get_cities_list tool to provide exactly 10 popular tourist cities in Japan. Call the tool with a cities array containing: Tokyo, Osaka, Kyoto, Hiroshima, Nara, Yokohama, Sapporo, Fukuoka, Sendai, and Nagoya."
                }
            ],
        }
    ],
    toolConfig={
        "tools": [
            {
                "toolSpec": {
                    "name": "get_cities_list",
                    "description": "Get a list of cities",
                    "inputSchema": {"json": {"type": "object", "properties": {"cities": {"type": "array", "items": {"type": "string"}}}}},
                }
            }
        ]
    },
)


res = ""

# Process the streaming response - error occurs here
for chunk in response["stream"]:
    if "contentBlockDelta" in chunk:
        delta = chunk["contentBlockDelta"]["delta"]
        if "toolUse" in delta:
            res += delta["toolUse"].get("input")

# Parse the output and print it
print(json.loads(res))

Expected Result

The ConverseStream API with tool_use should work seamlessly with OpenTelemetry instrumentation. Chunked responses should be properly processed without JSON parsing errors, maintaining the integrity of JSON strings across chunk boundaries.

For example, when a complete response is {"cities":["Tokyo","Osaka","Kyoto","Hiroshima","Nara","Yokohama","Sapporo","Fukuoka","Sendai","Nagoya"]} and it's split across chunks, the final assembled JSON should remain valid regardless of how the chunks are divided.

Actual Result

JSON parsing error occurs when chunked responses contain complete quoted strings.

Specific problem:

  • Expected complete JSON: {"cities":["Tokyo","Osaka","Kyoto","Hiroshima","Nara","Yokohama","Sapporo","Fukuoka","Sendai","Nagoya"]}
  • When delta chunks contain complete quoted strings like: "Tokyo", "Osaka", "Kyoto", etc.
  • The _decodetool_use function strips the quotes: Tokyo, Osaka, Kyoto, etc.
  • Final assembled JSON becomes: {"cities":[Tokyo,Osaka,Kyoto,"Hiroshima",Nara,"Yokohama",Sapporo,"Fukuoka",Sendai,"Nagoya"]} ← Invalid JSON
  • Result: JSON parsing fails due to multiple unquoted string values in the array

Root cause location: instrumentation/opentelemetry-instrumentation-botocore/src/opentelemetry/instrumentation/botocore/extensions/bedrock_utils.py

The issue occurs inconsistently depending on chunk boundaries:

  • Works: Chunks like {"cities":["Tok, yo","Osa, ka","Kyoto"]} (quotes not stripped due to incomplete strings)
  • Fails: Chunks like "Tokyo", "Osaka", "Kyoto" (complete quoted strings get quotes stripped)

Additional context

  • This is a regression introduced in v1.31.0/0.52b0
  • The same code works correctly with versions prior to v1.31.0/0.52b0
  • Confirmed on multiple platforms: macOS and Debian 12
  • The issue only affects ConverseStream API with tool_use functionality
  • Regular Bedrock API calls (non-streaming) are not affected
  • The problem impacts applications that rely on tool calling with streaming responses

The bug appears to be in the string parsing logic of the _decodetool_use function, which doesn't properly handle the case where a complete quoted string appears as a single chunk in the streaming response.

Would you like to implement a fix?

None

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions